FOREWORD


Thirteen years ago the Office of Ordnance Research (now the Army
Research Office-Durham) organized an OOR Liaison Group on Computers.
Two meetings of this group were held, one in 1959 and the other in
1960, to exchange information of interest to managers of ordnance
computers. The Army Mathematics Steering Committee decided that these
meetings should be revived on an Army-wide basis, and asked Dr. John H.
Giese, Chairman of its subcommittee on Numerical Analysis and Digital
Computers, to draw up a format for, and take charge of, the new series
of conferences. Dr. Giese thought these meetings should "establish a
way to exchange ideas on the Army's desires, capabilities, and interest
in the field of 'other-than business' applications of computers"; and
they should provide the AMSC and ARO with information on the Army's
needs for computers, requirements for assistance in research and
numerical analysis and other kinds of mathematics. He also suggested that
the title of the conferences should be the "ARO Working Group on
Computers". Two meetings, one in 1962 and the other in 1964, were held under
this title. Starting in 1965 these conferences have been held yearly
under the title "Army Numerical Analysis Conference".

Dr. Giese has served as chairman of all these conferences. Members
of the subcommittee on Numerical Analysis and Digital Computers have
assisted him on some of the planning details of the meetings. However,
most of the responsibilities of the arrangements were in his hands.
Thanks to his continuing efforts, all of the meetings have been held at
a high scientific level. Speakers and attendees at these conferences
would like to show their appreciation for all of your efforts, John,
by  dedicating  these  Proceedings  to  you.  They  are  sorry  that  you  will 
no  longer  be  serving  as  chairman,  but  they  do  feel  you  have  done  more 
than  your  share  of  work  in  promoting  these  conferences.  We  certainly 
hope  you  will  continue  to  participate  in  future  conferences  in  this 
area. 

The  theme  of  the  1972  Army  Numerical  Analysis  Conference  was  Systems 
Identification. This meeting was held on 20-21 April 1972 at the
Biomedical Laboratory at Edgewood Arsenal, Maryland. Dr. William J. Sacco
served  as  Chairman  on  Local  Arrangements.  All  those  in  attendance  are 
indebted  to  him  for  a  well-planned  conference  and  for  selection  of  some 
of  the  invited  speakers. 

The Army Mathematics Steering Committee, the sponsor of these
conferences, has asked that these Proceedings be issued to Army scientists
and to others interested in the science and application of numerical
analysis to applied problems. Members of this committee would like to
extend their thanks to the speakers for their interesting papers, and
to the chairmen and all others who participated in the conduct of
this meeting.

INITIAL VALUE METHODS FOR NONLINEAR BOUNDARY VALUE
PROBLEMS  AND  INTEGRAL  EQUATIONS 


Robert  Kalaba 

Biomedical  Engineering  Program 
Department  of  Electrical  Engineering 
University  of  Southern  California 
Los  Angeles,  California 

SUMMARY 

A  technique  has  been  developed  for  transforming  nonlinear  boundary 
value  problems  and  integral  equations  into  Cauchy  systems.  This  provides  an 
analytical  approach  to  nonlinear  problems  which  is  different  from  the  usual 
successive approximation and series expansion schemes. It is also
significant computationally, for modern analog and digital computing machines can
solve  initial  value  problems  with  considerable  speed  and  accuracy.  There 
are  implications  for  stochastic  nonlinear  equations. 

Applications  of  this  new  approach  in  biology,  physics,  and  engineering, 
both  analytically  and  computationally,  are  sketched. 

1.  Introduction 

During  the  early  1950’s  I  recognized  that  computing  machines  would  be 
able  to  solve  large  systems  of  nonlinear  ordinary  differential  equations, 
provided  that  a  complete  set  of  initial  conditions  is  known.  The  study  of 
some  physical  systems  leads  directly  to  such  initial  value  problems;  the 
study of others does not. Clearly, an important task would be the
transformation of integral equations and boundary value problems into initial value
problems  to  take  advantage  of  this  great  new  computational  ability.  There 
were  two  early  hints  that  this  could  be  done:  Ambarzumian  showed  that  the 
reflecting  properties  of  a  slab  could  be  found  without  first  determining 
the  entire  internal  field  [1],  and  Davidenko  [2]  showed  that  nonlinear 
transcendental  equations  could  be  reduced  to  initial  value  problems. 

In  a  long  series  of  papers  [3-8]  my  colleagues  and  I  have  shown  how  to 
transform  many  important  integral  equations  and  boundary  value  problems  of 
applied  mathematics  into  Cauchy  systems.  These  ideas  have  been  productive 
both  computationally  and  analytically.  For  the  most  part  these  systematic 
earlier  considerations  have  been  for  linear  systems,  though  there  have  been 
some exceptions [9].

In  recent  months  we  have  found  general  methods  for  converting  nonlinear 
boundary  value  problems  and  nonlinear  integral  equations  into  Cauchy  systems. 
No  use  of  the  usual  successive  approximation  or  series  expansion  techniques 
is made. Let us now take up a special case to indicate the approach. Then in
§3 we cover numerical aspects. Next stochastic equations and nonlinear
integral equations are treated. Initial value problems in one parameter are
transformed into an initial value problem in another in §6. A broad program
of applications in biology, physics and engineering is presented in §7.


The  remainder  of  this  paper  has  been  reproduced  photographically  from 
the  author's  manuscript. 


2. A Nonlinear Boundary Value Problem [10]

To  illustrate  the  reduction  of  a  nonlinear  boundary  value  problem  to 
a  Cauchy  system  we  consider  the  problem 

(1)  $\ddot{u}(t) = \lambda f(u(t)), \quad 0 < t < 1,$

(2)  $u(0) = u(1) = 0,$

and assume that a unique solution exists for $0 < \lambda < \Lambda$. As usual, the dots
over a variable indicate differentiation with respect to t. Since the solution
u is a function of $\lambda$, as well as t, we shall write

(3)  $u = u(t, \lambda), \quad 0 < t < 1, \quad 0 < \lambda < \Lambda.$

Equations (1) and (2) become, in this expanded notation,

(4)  $\ddot{u}(t, \lambda) = \lambda f(u(t, \lambda)), \quad 0 < t < 1,$

(5)  $u(0, \lambda) = u(1, \lambda) = 0.$

Assuming appropriate differentiability properties we find that the function $u_\lambda$
satisfies the linear boundary value problem

(6)  $[u_\lambda(t, \lambda)]^{\cdot\cdot} = f(u(t, \lambda)) + \lambda f'(u(t, \lambda))\, u_\lambda(t, \lambda), \quad 0 < t < 1,$

(7)  $u_\lambda(0, \lambda) = u_\lambda(1, \lambda) = 0,$

where, as usual, the subscript denotes a partial derivative with respect to $\lambda$.

To solve equations (6) and (7) for $u_\lambda$ consider the function w, the solution
of the linear problem

(8)  $\ddot{w}(t, \lambda) = g(t, \lambda) + \lambda f'(u(t, \lambda))\, w(t, \lambda), \quad 0 < t < 1,$

(9)  $w(0, \lambda) = w(1, \lambda) = 0.$

In terms of an appropriate Green's function G, for an arbitrary forcing
function g the function w is given as

(10)  $w(t, \lambda) = \int_0^1 G(t, y', \lambda)\, g(y', \lambda)\, dy', \quad 0 < t < 1, \quad 0 < \lambda < \Lambda.$

It follows that the function $u_\lambda$ may be represented in the form

(11)  $u_\lambda(t, \lambda) = \int_0^1 G(t, y', \lambda)\, f(u(y', \lambda))\, dy', \quad 0 < t < 1, \quad 0 < \lambda < \Lambda.$

This is viewed as a differential equation for the function u, the independent
variable being $\lambda$. The initial condition at $\lambda = 0$ is

(12)  $u(t, 0) = 0, \quad 0 < t < 1,$

according to equations (4) and (5).

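Next we obtain a differential equation and an initial condition for the
Green's function G. From equation (10) we notice that

(13)  $w_\lambda(t, \lambda) = \int_0^1 G_\lambda(t, y', \lambda)\, g(y', \lambda)\, dy' + \int_0^1 G(t, y', \lambda)\, g_\lambda(y', \lambda)\, dy'.$

On the other hand, we obtain a two point boundary value problem for the
function $w_\lambda$ from equations (8) and (9). It is

(14)  $[w_\lambda(t, \lambda)]^{\cdot\cdot} = g_\lambda(t, \lambda) + f'(u(t, \lambda))\, w(t, \lambda) + \lambda f''(u(t, \lambda))\, u_\lambda(t, \lambda)\, w(t, \lambda) + \lambda f'(u(t, \lambda))\, w_\lambda(t, \lambda),$

(15)  $w_\lambda(0, \lambda) = w_\lambda(1, \lambda) = 0.$

According to equations (8), (9) and (10) the solution of equations (14) and (15) is

(16)  $w_\lambda(t, \lambda) = \int_0^1 G(t, y', \lambda)\, \big[ g_\lambda(y', \lambda) + \big( f'(u(y', \lambda)) + \lambda f''(u(y', \lambda))\, u_\lambda(y', \lambda) \big)\, w(y', \lambda) \big]\, dy'.$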


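It is now convenient to introduce the auxiliary variable M,

(17)  $M(t, \lambda) = f'(u(t, \lambda)) + \lambda f''(u(t, \lambda))\, u_\lambda(t, \lambda)$

      $= f'(u(t, \lambda)) + \lambda f''(u(t, \lambda)) \int_0^1 G(t, y', \lambda)\, f(u(y', \lambda))\, dy', \quad 0 < t < 1, \quad 0 < \lambda < \Lambda.$

Equation (16) then becomes

(18)  $w_\lambda(t, \lambda) = \int_0^1 G(t, y', \lambda)\, g_\lambda(y', \lambda)\, dy' + \int_0^1 G(t, y', \lambda)\, M(y', \lambda)\, w(y', \lambda)\, dy'$

      $= \int_0^1 G(t, y', \lambda)\, g_\lambda(y', \lambda)\, dy' + \int_0^1 G(t, y', \lambda)\, M(y', \lambda) \int_0^1 G(y', y, \lambda)\, g(y, \lambda)\, dy\, dy'.$
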
In view of the two representations for the function $w_\lambda$ in equations
(13) and (18) and the arbitrariness of the function g, we see that

(19)  $G_\lambda(t, y, \lambda) = \int_0^1 G(t, y', \lambda)\, M(y', \lambda)\, G(y', y, \lambda)\, dy', \quad 0 < t, y < 1, \quad 0 < \lambda < \Lambda,$

where the function M is given in equation (17). The initial condition on the
Green's function G at $\lambda = 0$ is known to be

(20)  $G(t, y, 0) = \begin{cases} y(t-1), & 0 < y < t, \\ t(y-1), & t < y < 1. \end{cases}$

The  desired  Cauchy  system  for  the  functions  u  and  G  consists  of  the 
differential  equations  in  equations  (11)  and  (19),  the  auxiliary  relation  (17) 
for  the  variable  M,  and  the  initial  conditions  in  equations  (12)  and  (20). 

It  is  a  straightforward  matter  to  establish  that  a  solution  of  the 
Cauchy  system  provides  a  solution  of  the  original  two  point  boundary  value 
problem. 

3.  Numerical  Aspects 

Based on much previous experience [3,4,8] we believe that the method
of lines [12] provides an effective approach to the numerical solution of the
Cauchy system just given. The basic idea is to approximate the integrals on
the interval (0,1) by means of a quadrature formula. In that way the
differential-integral equations are approximated by a system of ordinary differential
equations for which the independent variable is $\lambda$. Since a complete set of
initial values for u and G is known at $\lambda = 0$, the original boundary value problem
is reduced to a system of ordinary differential equations subject to known
initial conditions. Modern digital, analog and hybrid computers are
well-suited for this task. We routinely integrate systems of order 10 or so in the
year 1972.

Let us use the approximation

(1)  $\int_0^1 f(y')\, dy' \cong \sum_{i=1}^{N} f(r_i)\, w_i.$

Then equation (11) of the previous section is approximated by the ordinary
differential equations

(2)  $du_i(\lambda)/d\lambda = \sum_{j=1}^{N} G_{ij}(\lambda)\, f(u_j(\lambda))\, w_j, \quad i = 1, 2, \ldots, N,$

where

(3)  $u_i(\lambda) = u(r_i, \lambda)$

and

(4)  $G_{ij}(\lambda) = G(r_i, r_j, \lambda), \quad i, j = 1, 2, \ldots, N.$

Equation (19) becomes

(5)  $dG_{ij}(\lambda)/d\lambda = \sum_{m=1}^{N} G_{im}(\lambda)\, M_m(\lambda)\, G_{mj}(\lambda)\, w_m, \quad i, j = 1, 2, \ldots, N,$

in an obvious notation. Thus there are $N + N^2$ ordinary differential equations
with evident initial conditions from equations (12) and (20). In addition the
analogue of equation (17) is required.

We have done [10] trial computations with f(u) = exp(u). We
approximated the integrals by using Simpson's rule with twenty intervals. The
resulting system of ordinary differential equations was integrated for $0 < \lambda < 1$
and gave accuracy to within one part in five thousand. This demonstrates the
computational feasibility of the method. In view of the known discontinuity in
the derivative of $G(t, y, \lambda)$ with respect to y at y = t, it would be desirable to
find ways to make the computation as efficient as possible and to compare it
against other standard methods such as quasilinearization [13]. Where the
parameter study in $\lambda$ is required, the efficiency of the proposed method is
beyond dispute. Even if the solution of the nonlinear boundary value problem
is desired for only $\lambda = \Lambda$, the proposed method is interesting analytically and
possibly numerically, for no solving of linear algebraic equations is required.
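
The discretized Cauchy system of this section can be sketched in a few lines of Python. The sketch below is illustrative only and is not the computation reported in [10]: it assumes f(u) = exp(u), Simpson's rule with twenty intervals, and a classical fourth-order Runge-Kutta march in $\lambda$ (the choice of integrator, the step count, and all function names are assumptions). The closed-form comparison solution in `exact` is a standard result for this particular f and is likewise not taken from the text.

```python
import math

def simpson_weights(n_intervals):
    """Composite Simpson weights on [0, 1]; n_intervals must be even."""
    h = 1.0 / n_intervals
    w = [(4.0 if i % 2 else 2.0) * h / 3.0 for i in range(n_intervals + 1)]
    w[0] = w[-1] = h / 3.0
    return w

def rhs(u, G, lam, w):
    """Discretized equations (11), (17), (19) for f(u) = exp(u)."""
    n = len(u)
    fu = [math.exp(x) for x in u]                 # f = f' = f'' = exp
    du = [sum(G[i][j] * fu[j] * w[j] for j in range(n)) for i in range(n)]
    Mw = [(fu[m] + lam * fu[m] * du[m]) * w[m] for m in range(n)]
    dG = [[sum(G[i][m] * Mw[m] * G[m][j] for m in range(n))
           for j in range(n)] for i in range(n)]
    return du, dG

def shifted(u, G, du, dG, s):
    """Return the state (u + s*du, G + s*dG)."""
    return ([a + s * b for a, b in zip(u, du)],
            [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(G, dG)])

def solve_cauchy(lam_end=1.0, n_intervals=20, n_steps=20):
    """March u and G from lam = 0 to lam_end with classical RK4."""
    t = [i / n_intervals for i in range(n_intervals + 1)]
    w = simpson_weights(n_intervals)
    u = [0.0] * len(t)                            # initial condition (12)
    G = [[y * (x - 1.0) if y <= x else x * (y - 1.0) for y in t]
         for x in t]                              # initial condition (20)
    h, lam = lam_end / n_steps, 0.0
    for _ in range(n_steps):
        k1 = rhs(u, G, lam, w)
        k2 = rhs(*shifted(u, G, *k1, h / 2), lam + h / 2, w)
        k3 = rhs(*shifted(u, G, *k2, h / 2), lam + h / 2, w)
        k4 = rhs(*shifted(u, G, *k3, h), lam + h, w)
        for k, c in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            u, G = shifted(u, G, *k, c * h / 6)   # RK4 weighted update
        lam += h
    return t, u

def exact(t, lam):
    """Closed form for u'' = lam*exp(u), u(0) = u(1) = 0, lam > 0."""
    lo, hi = 0.0, math.pi - 1e-9                  # bisection on 2 a^2 = lam cos^2(a/2)
    for _ in range(200):
        a = 0.5 * (lo + hi)
        if 2 * a * a < lam * math.cos(a / 2) ** 2:
            lo = a
        else:
            hi = a
    return [2 * math.log(math.cos(a / 2) / math.cos(a * (x - 0.5))) for x in t]
```

Calling `solve_cauchy(1.0)` returns the grid and the approximate solution at $\lambda = 1$; intermediate values of $\lambda$ come for free during the march, which is the parameter study mentioned above. No linear algebraic system is solved at any stage.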

4.  Nonlinear  Stochastic  Equations  and  Other  Matters 

In the previous section we have indicated how to produce numerically
the function $u(t, \lambda)$, $0 < t < 1$, for $0 < \lambda < \Lambda$, where u is the solution of the
nonlinear two-point boundary value problem in equations (1) and (2). There
are at least three advantages in being able to produce the function u for all
of these values of $\lambda$. In the first place, it automatically provides a "parameter
study" which is often required in engineering and biological applications.

Secondly, it provides a way of treating stochastic nonlinear boundary
value problems. First determine $u = u(t, \lambda)$ as above for $0 < \lambda < \Lambda$. Then
suppose that $\lambda$ is a random variable having the probability density function
$p = p(\lambda)$ for $0 < \lambda < \Lambda$. Let the m-th moment of u(t) be denoted by $M_m(t)$,
$0 < t < 1$. Then we have

(1)  $M_m(t) = \int_0^\Lambda u^m(t, \lambda)\, p(\lambda)\, d\lambda, \quad 0 < t < 1,$

which is readily evaluated numerically.
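
As a small illustration of how the moment integral might be evaluated, the sketch below applies Simpson's rule in $\lambda$. It is an assumption-laden toy: the stand-in solution $u(t, \lambda) = \lambda t(t-1)/2$ is what the boundary value problem of §2 gives for the special choice f = 1, the uniform density and the value $\Lambda = 2$ are arbitrary, and the function names are hypothetical. In practice the samples of u would come from the Cauchy system.

```python
def simpson(values, a, b):
    """Composite Simpson rule for an odd number of equally spaced samples."""
    n = len(values) - 1
    h = (b - a) / n
    total = values[0] + values[-1]
    total += 4 * sum(values[1:n:2]) + 2 * sum(values[2:n:2])
    return total * h / 3

def moment(u, p, t, m, lam_max, n_intervals=40):
    """m-th moment of u(t, lam) when lam has density p on (0, lam_max)."""
    lams = [lam_max * i / n_intervals for i in range(n_intervals + 1)]
    return simpson([u(t, lam) ** m * p(lam) for lam in lams], 0.0, lam_max)

LAM_MAX = 2.0                                   # illustrative value of Lambda

def u_standin(t, lam):
    """Solution of u'' = lam, u(0) = u(1) = 0 (the case f = 1)."""
    return lam * t * (t - 1.0) / 2.0

def uniform_density(lam):
    """Uniform probability density on (0, LAM_MAX)."""
    return 1.0 / LAM_MAX
```

For example, `moment(u_standin, uniform_density, 0.5, 1, LAM_MAX)` gives the mean of u at the midpoint t = 0.5.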

Thirdly, a uniform approach to system identification problems is
provided. Suppose that $b_1, b_2, \ldots, b_R$ are observed values of $u(t_i)$, and we wish
to select $\lambda$ so that we minimize

(2)  $S(\lambda) = \sum_{i=1}^{R} [u(t_i, \lambda) - b_i]^2,$

where equations (1) and (2) of §2 hold.

Transforming the boundary value problem into a Cauchy system, as
we have explained, puts the problem in a form for which much is known [13].
It also makes possible the use of gradient techniques for effecting the
minimization.
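
One possible gradient-type scheme for this least-squares problem is a Gauss-Newton iteration on the single parameter $\lambda$, which needs only u and its derivative $u_\lambda$ at the observation points; the Cauchy system supplies both. The sketch below is a hypothetical illustration, not the procedure of [13]: the stand-in model $u = \lambda t(t-1)/2$ (the case f = 1 of §2), the observation points, and the function names are all assumptions.

```python
def fit_lambda(u, u_lam, t_obs, b_obs, lam0=0.0, n_iter=20):
    """Gauss-Newton iteration for min over lam of sum (u(t_i, lam) - b_i)^2."""
    lam = lam0
    for _ in range(n_iter):
        resid = [u(t, lam) - b for t, b in zip(t_obs, b_obs)]
        sens = [u_lam(t, lam) for t in t_obs]     # sensitivities du/dlam
        denom = sum(s * s for s in sens)
        if denom == 0.0:                          # no information about lam
            break
        lam -= sum(s * r for s, r in zip(sens, resid)) / denom
    return lam

def u_standin(t, lam):
    """Solution of u'' = lam, u(0) = u(1) = 0 (the case f = 1)."""
    return lam * t * (t - 1.0) / 2.0

def u_lam_standin(t, lam):
    """Derivative of the stand-in solution with respect to lam."""
    return t * (t - 1.0) / 2.0

t_points = [0.25, 0.5, 0.75]                      # hypothetical observation times
observations = [u_standin(t, 0.7) for t in t_points]   # noise-free synthetic data
estimate = fit_lambda(u_standin, u_lam_standin, t_points, observations)
```

Because the stand-in model is linear in $\lambda$, the iteration converges in one step here; for a genuinely nonlinear f several steps would normally be needed.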


5. Nonlinear Integral Equations [14]

A broad and important class of nonlinear integral equations has the
form

(1)  $u(t) = g(t, \lambda) + \lambda \int_0^1 k(t, y, \lambda, u(y))\, dy, \quad 0 < t < 1.$

The parameter $\lambda$ may lie in an interval $(0, \Lambda)$. To emphasize the dependence
of the unknown function u upon the parameter $\lambda$, as well as upon the variable t,
we shall write

(2)  $u(t, \lambda) = g(t, \lambda) + \lambda \int_0^1 k(t, y, \lambda, u(y, \lambda))\, dy, \quad 0 < t < 1, \quad 0 < \lambda < \Lambda.$

Then, under suitable regularity properties, it is possible to demonstrate the
equivalence between the nonlinear integral equation, Eq. (2), and the Cauchy
system for u and the auxiliary function K

(3)  $u_\lambda(t, \lambda) = Y(t, \lambda) + \lambda \int_0^1 K(t, y', \lambda)\, Y(y', \lambda)\, dy',$

(4)  $K_\lambda(t, y, \lambda) = Q(t, y, \lambda) + \lambda \int_0^1 K(t, y', \lambda)\, Q(y', y, \lambda)\, dy', \quad 0 < t, y < 1, \quad 0 < \lambda < \Lambda,$

(5)  $u(t, 0) = g(t, 0),$

(6)  $K(t, y, 0) = k_u(t, y, 0, g(y, 0)), \quad 0 < t, y < 1,$

where the functions Y and Q are certain functionals on u and K.

Preliminary numerical experiments in which $g(t, \lambda) = 1 - (\lambda/2)t$ and
$k(t, y, \lambda, u) = t y u^2$ have shown the feasibility of the method. That $\lambda = 3/2$
is a bifurcation point is obtained effortlessly, for the auxiliary function K
becomes infinite there. How to continue the solution through such a point
is a matter of great interest.
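
The value $\lambda = 3/2$ can be checked independently of the Cauchy system. For this g and k the constant function u = 1 satisfies the integral equation for every $\lambda$, and the equation linearized about u = 1, namely $v(t) = 2\lambda t \int_0^1 y\, v(y)\, dy$, admits the nontrivial solution v = t exactly when $1 - 2\lambda \int_0^1 y^2\, dy = 0$. The short Python sketch below verifies both statements by quadrature; the function names are illustrative, and the interpretation that this loss of invertibility is what makes K blow up is an assumption consistent with, but not stated in, the text.

```python
def simpson_rule(n_intervals=20):
    """Nodes and composite Simpson weights on [0, 1]."""
    h = 1.0 / n_intervals
    y = [i * h for i in range(n_intervals + 1)]
    w = [(4.0 if i % 2 else 2.0) * h / 3.0 for i in range(n_intervals + 1)]
    w[0] = w[-1] = h / 3.0
    return y, w

def residual_of_constant_solution(lam, t, y, w):
    """u(t) - g(t, lam) - lam * integral of t*y*u(y)^2 dy for the trial u = 1."""
    integral = sum(t * yj * 1.0 ** 2 * wj for yj, wj in zip(y, w))
    return 1.0 - (1.0 - 0.5 * lam * t) - lam * integral

def linearization_factor(lam, y, w):
    """1 - 2*lam * integral of y^2 dy; zero means the linearized equation is singular."""
    return 1.0 - 2.0 * lam * sum(yj * yj * wj for yj, wj in zip(y, w))
```

Simpson's rule integrates y and $y^2$ exactly, so the residual vanishes to rounding error for any $\lambda$ and the factor vanishes at $\lambda = 3/2$.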

6.  Initial  Value  Problems 

Consider  the  Cauchy  system 

(1)  $\dot{x}(t, \lambda) = f(x(t, \lambda), \lambda), \quad 0 < t < T,$

(2)  $x(0, \lambda) = c,$

where $\lambda$ is a parameter lying in the interval $(0, \Lambda)$. Frequently we desire a
parameter study in $\lambda$ of the solution of the equations (1) and (2). One procedure,
of course, is to solve the system as an initial value problem in t for various
values of $\lambda$. However, there is an alternative: transform the system (1) and
(2) into a Cauchy system in which $\lambda$ becomes the time-like variable. Such a
system is

(3)  $x_\lambda(t, \lambda) = M(t, \lambda),$

(4)  $M(t, \lambda) = \int_0^t [\varphi(t, \lambda)/\varphi(y, \lambda)]\, f_\lambda(x(y, \lambda), \lambda)\, dy,$

(5)  $x(t, 0) = g(t),$

(6)  $\varphi_\lambda(t, \lambda) = \int_0^t [\varphi(t, \lambda)/\varphi(y, \lambda)]\, V(y, \lambda)\, dy,$

(7)  $V(t, \lambda) = f_{xx}(x(t, \lambda), \lambda)\, M(t, \lambda) + f_{x\lambda}(x(t, \lambda), \lambda)\, \varphi(t, \lambda),$

(8)  $\varphi(t, 0) = h(t), \quad 0 < \lambda < \Lambda, \quad 0 < t < T.$

The initial conditions at $\lambda = 0$ in equations (5) and (8) are obtained by
integrating equations (1) and (2) for $\lambda = 0$ and by integrating the system

(9)  $\dot{\varphi}(t, 0) = f_x(x(t, 0), 0)\, \varphi(t, 0), \quad 0 < t < T,$

(10)  $\varphi(0, 0) = 1.$

Under some circumstances it might be preferable numerically to solve the
system (3) - (10) rather than the system (1) and (2). This remains to be
investigated.
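
A quick consistency check of equations (3), (4), (6) and (7) is possible on a problem with a known solution. The sketch below takes the hypothetical example $f(x, \lambda) = -\lambda x$, for which $x = c\, e^{-\lambda t}$, and assumes that $\varphi$ satisfies $\dot{\varphi} = f_x \varphi$, $\varphi(0, \lambda) = 1$ for every $\lambda$ (the text states this only at $\lambda = 0$), so that $\varphi = e^{-\lambda t}$. The integrals are evaluated by Simpson's rule and compared with the derivatives of the closed forms. All names and numerical values are illustrative.

```python
import math

C = 1.0   # initial value c of equation (2); illustrative

def x(t, lam):
    """Solution of x_dot = -lam*x, x(0) = C."""
    return C * math.exp(-lam * t)

def phi(t, lam):
    """Solution of phi_dot = f_x*phi = -lam*phi, phi(0) = 1 (assumed for all lam)."""
    return math.exp(-lam * t)

def simpson(func, a, b, n=200):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    total += sum((4 if i % 2 else 2) * func(a + i * h) for i in range(1, n))
    return total * h / 3

def M(t, lam):
    """Equation (4) with f_lam(x, lam) = -x."""
    return simpson(lambda y: phi(t, lam) / phi(y, lam) * (-x(y, lam)), 0.0, t)

def phi_lam(t, lam):
    """Equations (6)-(7); here f_xx = 0 and f_x_lam = -1, so V = -phi."""
    return simpson(lambda y: phi(t, lam) / phi(y, lam) * (-phi(y, lam)), 0.0, t)
```

For this example both quadratures reproduce the exact derivatives $x_\lambda = -c\, t\, e^{-\lambda t}$ and $\varphi_\lambda = -t\, e^{-\lambda t}$. Since $f_{xx} = 0$ here, the check does not exercise the first term of equation (7).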

There  are  similar  discussions  for  the  conversion  of  initial  value 
problems  for  partial  differential  equations  into  Cauchy  systems  in  which  a 
selected parameter becomes the time-like variable.

7.  Applications 

I  have  in  mind  applications  in  physics,  engineering,  and  biology. 
Electromagnetic  theory  and  radiative  transfer  should  be  investigated.  One 
of the principal nonlinear integral equations of radiative transfer is

(1)  $\varphi(\eta) = 1 + \frac{\lambda}{2}\, \eta\, \varphi(\eta) \int_0^1 \frac{\varphi(\xi)}{\eta + \xi}\, d\xi, \quad 0 < \eta < 1, \quad 0 < \lambda < 1.$

A start on the study of this nonlinear integral equation using initial value
methods has been made [15]; in fact, successful calculations have been
performed, but much remains to be investigated. The behavior of the
solution of the associated Cauchy system near $\lambda = 1$, a bifurcation point, is
interesting.

The  theory  of  optimal  filtering,  detection,  and  control  abounds  with 
integral equations, many of which are nonlinear [16]. These should be
studied with emphasis on filtering of physiological data, as well as
communication and radar signals. The detection of arrhythmias in coronary patients
is  a  possibility. 

The theory of thin shells of revolution [17] depends upon solving
nonlinear systems of coupled integral equations. Here we have to derive
the appropriate Cauchy systems and then do test calculations. This is an
extension of our work on the linear integral equations of elasticity theory [18].
Applications  to  biomechanics  should  be  stressed,  especially  the  study  of 
trauma  due  to  a  blow  to  the  head. 

Nonlinear  integral  equations  are  used  to  describe  lateral  inhibition 
in neural systems [19]. The dependent variable is a function of several spatial
variables.  Computational  solution  using  initial  value  methods  is  a  challenge. 

Nonlinear  boundary  value  problems  and  integral  equations  abound  in 
the study of fluid and electrolyte transport in physiological systems [20,21,22].
Here the biological interpretation of the Cauchy system will be particularly
interesting. 

A  nonlinear  differential  equation  with  nonlinear  boundary  conditions 
is  treated  in  [23]. 

8.  Discussion 

In the previous pages I have adumbrated a uniform approach to
nonlinear boundary value problems and integral equations. I feel that it will
become as effective for nonlinear problems, in this age of computing machines,
as the eigenfunction expansion technique was for linear problems in
pre-computer days. It possesses the great merits of being simple in concept,
broad  in  application  and  effective  in  computation. 
OPTOELECTRONIC COMPUTATIONAL TECHNIQUES FOR FAST PICTURE PROCESSING

JENNY  BRAMLEY 

Geographic  Information  Systems  Division 
U.  S.  Army  Engineer  Topographic  Laboratories 
Fort  Belvoir,  Virginia  22060 

In an earlier report¹, I have shown how the use of analog TV-type
techniques can greatly reduce both the cost of operation and
the  time  required  for  the  processing  of  pictorial  information.  The 
inherent  limitation  on  the  precision  of  a  TV-type  computer  is  not 
serious  in  this  case  since  there  is  a  comparable  limitation  on  the 
accuracy  of  the  experimentally  obtained  picture  data.  That  earlier 
procedure  was  strictly  sequential.  The  relatively  high  processing 
speed  was  the  result  of  the  elimination  of  drawbacks  inherent  in 
the digital computer, namely the delays in the input and output
functions and the need for piecewise operation due to the inadequacy of
the  memory  for  the  amount  of  data  being  processed. 

Further  consideration  of  possible  analog  approaches  indicated 
that the addition of state-of-the-art (though not necessarily
off-the-shelf) optoelectronic devices allows an increase in the previous
operating  speed  by  three  orders  of  magnitude.  Thus  this  report 
makes  the  earlier  one  obsolete.  The  speed-up  is  due  to  the  use  of 
parallel  or  near-parallel  processing  techniques. 

To keep this presentation within a reasonable length, I
restrict the discussion to the following four operations:
(1) convolution, (2) Fourier transforms, (3) filtering,
(4) algorithms.


(1)  The  basis  of  the  convolution  approach,  as  well  as  of  the 
other  operations,  is  the  imparting  of  information  to  a  light  beam 
by  passing  it  through  a  transparency.  Consider  two  transparencies 
A  and  B  with  fiducial  marks  to  determine  registry  and  relative 
orientation. Let resolution element (x, y) of A have an absorption
$a_1 = -\log f_1(x, y) > 0$ (assuming that the function $f_1(x, y) < 1$).
The intensity U of a beam of parallel light passing through (x, y)
is transformed into $U f_1(x, y)$. Similarly, if the element of B
superimposed on element (x, y) of A is specified by (s-x, t-y) and
has an absorption $a_2 = -\log f_2(s-x, t-y)$, then after passing through
the set, the light beam has an intensity $U f_1(x, y) f_2(s-x, t-y)$. The
total amount of light transmitted through the set AB (usually
measured by a photomultiplier) is:

$I(s, t) = \iint_S U f_1(x, y)\, f_2(s-x, t-y)\, dx\, dy$

The  integration  is  performed  over  the  area  S  being  investigated. 
The  function  I(s,  t)  is  the  convolution  integral. 
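
In discrete form, with one number per resolution element, the photomultiplier reading for a given relative position (s, t) is simply a sum of products of transmittances. The following Python sketch illustrates this under stated assumptions: the transparencies are small nested lists of transmittances in [0, 1], elements of B that fall outside its frame are treated as opaque, and the function name and sample values are hypothetical.

```python
def transmitted_light(f1, f2, s, t, U=1.0):
    """Discrete I(s, t): sum over (x, y) of U * f1[x][y] * f2[s-x][t-y]."""
    total = 0.0
    for x, row in enumerate(f1):
        for y, a in enumerate(row):
            i, j = s - x, t - y
            # Elements of B outside its frame pass no light (assumption).
            if 0 <= i < len(f2) and 0 <= j < len(f2[0]):
                total += U * a * f2[i][j]
    return total

# Two tiny illustrative transparencies (transmittances between 0 and 1).
A = [[1.0, 0.5],
     [0.0, 0.25]]
B = [[0.5, 1.0],
     [1.0, 0.0]]
```

Evaluating `transmitted_light(A, B, s, t)` for every (s, t) of interest is the serial counterpart of what the optical arrangement obtains one deflection at a time.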


In  principle,  the  change  in  the  relative  positions  of  A  and 
B,  which  would  give  rise  to  different  values  of  s  and  t,  could  be 
achieved  by  mechanical  means.  However,  for  an  operation  of 
interest  in  picture  processing,  this  brute  force  approach  is  grossly 
inadequate from the point of view of achievable speed and the
specification of relative orientation. I propose replacing it with the
following optoelectronic arrangement.


FIGURE 1. Image Converter Tube (labeled components: photocathode, magnetic deflection coil, phosphor screen)

My  "central  processor"  is  a  magnetically  focused  image  converter 
tube  (schematized  in  Fig.  1)  with  flat  fiber  optics  plates  at  input 
and  output.  This  is  an  off-the-shelf  item.  Transparency  A  is 
mounted  directly  in  front  of  the  faceplate,  and  transparency  B, 
which  is  several  times  smaller  than  A--to  permit  its  correlations 
with  different  portions  of  A--is  mounted  directly  in  front  of  the 
photocathode.  This  arrangement  eliminates  the  use  of  lenses  with 
attendant  loss  of  light  and  provides  for  compactness.  Transparency  B 
is  illuminated  and  forms  the  input  on  the  photocathode.  A  deflection 
coil,  such  as  used  in  the  Goodyear  Correlatron,  permits  scanning  the 
luminescent  output  image  across  the  faceplate  in  any  desired  pattern. 
At  each  position,  the  light  from  the  luminescent  image  of  B  passes 
through  A  and  is  picked  up  by  a  photomultiplier  in  an  arrangement 
that is conventional for flying spot scanners. The position of A
is  specified  by  fiducial  marks,  that  of  B  is  determined  by  the 
current  passing  through  the  deflection  coil.  The  output  of  the 
photomultiplier  can  be  recorded  or  displayed  in  any  conventional 
manner  to  specify  the  occurrence  and  magnitude  of  the  maximum. 

The  determination  of  each  correlation  value  can  proceed  at 
rates  standard  for  a  flying  spot  scanner,  no  matter  how  large  the 
area  being  correlated.  The  limiting  speed  factor  is  the  phosphor 
decay  time.  In  the  case  of  a  P16  phosphor  screen,  a  high  degree 
of  accuracy  may  call  for  a  rate  somewhat  slower  than  conventional 
TV time per deflection, e.g., \ μsec per correlation. Allowing
for  retrace  times,  this  provides  more  than  2000  correlations  per 
millisecond.  The  main  time  delay  arises  in  changing  pictures  because 
rigorous  alignment  of  picture  A  is  critical.  Since  picture  B  is 
displayed  on  an  image  converter  tube,  the  controls  are  electronic 
and  more  easily  achieved.  However,  with  suitable  fiducial  marks, 
the  alignment  of  picture  A  can  probably  be  automated  to  require 
only  a  fraction  of  a  second. 

(2)  The  Fourier  transform  of  a  picture  can  be  treated  as  a 
special  case  of  correlation,  based  on  the  following  mathematics: 

Let g(n, m) (n, m = 1, ..., N) represent the intensity at every
point (n, m) of a picture of N × N elements considered as a matrix.
The first index numbers the rows and the second one the columns. Take
the cosine transform as an example. The coefficients in it are defined as

$C(n, m) = \sum_{\mu=1}^{N} g(n, \mu) \cos(2\pi m \mu / N),$

or in terms of a positive processing function $f_c^{(m)}(\mu) = 1 + \cos(2\pi m \mu / N)$

$C(n, m) = \sum_{\mu=1}^{N} g(n, \mu)\, f_c^{(m)}(\mu) - \sum_{\mu=1}^{N} g(n, \mu) \qquad (1)$


The  first  term  on  the  right-hand  side  of  Eq.  (1)  is  a  convolution-- 
expressed  as  a  finite  sum.  Using  an  image  converter  tube  and  an 
optical  processing  plate,  all  the  multiplications  and  additions  in 
Eq.  (1)  can  be  performed  in  parallel.  The  processing  plate  is  a 
transparency of N+1 parallel strips, each strip having the width
of a resolution element of the picture and a length N times the width.
The transmittivities of the successive strips are

$k,\ k f_c^{(1)}(\mu),\ k f_c^{(2)}(\mu),\ \ldots,\ k f_c^{(N)}(\mu),$

that is,

$k,\ k(1 + \cos 2\pi\mu/N),\ k(1 + \cos 4\pi\mu/N),\ \ldots,\ 2k, \qquad k < 0.5,$

where $\mu$ is the running index, which assumes all integer values from 1
to N. The quantity k is a constant of the plate. While the preparation
of such a plate entails time and expense, it is a onetime operation.

To  obtain  the  coefficients  C(n,  m)  of  the  transform,  line  n  of 
the  picture  is  projected  successively  on  the  N+l  transmissive  strips. 

The light is then focussed on a photomultiplier as in the correlation
operation. In suitable units, the total amount of light transmitted
through strips 1, 2, and m+1 is

$k \sum_{\mu=1}^{N} g(n, \mu) \equiv g_n, \qquad k \sum_{\mu=1}^{N} f_c^{(1)}(\mu)\, g(n, \mu), \qquad k \sum_{\mu=1}^{N} f_c^{(m)}(\mu)\, g(n, \mu), \qquad (2)$

respectively.

Hence to obtain the coefficients C(n, m) of the Fourier transform
for any value of m (and a given line n), we store $g_n$ and take the
difference between the photomultiplier output after illumination of the
line m+1 and the stored signal $g_n$. The proportionality coefficient k
can be determined by calibration.
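
The subtraction scheme can be checked numerically. The Python sketch below simulates the photomultiplier signal for each strip and recovers the cosine coefficient as (signal of strip m+1 minus stored signal of strip 1) divided by k. It is only an illustration: the plate constant k = 0.4, the sample picture line, and the function names are assumptions, and the division by k stands in for the calibration mentioned above.

```python
import math

def cosine_coeff_direct(g_row, m):
    """C(n, m) = sum over mu of g(n, mu) * cos(2*pi*m*mu/N), with mu = 1..N."""
    N = len(g_row)
    return sum(g * math.cos(2 * math.pi * m * mu / N)
               for mu, g in enumerate(g_row, start=1))

def strip_signal(g_row, strip, k=0.4):
    """Light through a strip: strip 1 is uniform (k), strip m+1 is k*(1 + cos)."""
    N = len(g_row)
    if strip == 1:
        return k * sum(g_row)
    m = strip - 1
    return k * sum((1 + math.cos(2 * math.pi * m * mu / N)) * g
                   for mu, g in enumerate(g_row, start=1))

def cosine_coeff_optical(g_row, m, k=0.4):
    """Difference of two non-negative signals, rescaled by the plate constant."""
    g_n = strip_signal(g_row, 1, k)               # stored reference signal
    return (strip_signal(g_row, m + 1, k) - g_n) / k

line = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]             # one illustrative picture line
```

Every strip transmittance stays between 0 and 2k = 0.8, so only non-negative light intensities are needed even though the cosine coefficients themselves may be negative.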

This  scheme  allows  each  transform  coefficient  to  be  obtained  in 
a  single  step  instead  of  requiring  N  successive  multiplications  as  in 
the  case  of  a  conventional  serial  computer.  To  implement  it,  we  use 
the  same  arrangement  as  for  convolution.  The  processing  plate  is 
picture A, while picture B is a transparency of N × N resolution
elements (N ≈ 1000). It is illuminated one horizontal resolution
line at a time and is imaged on the photocathode of the image
converter tube. By means of magnetic deflection, it is placed
successively in front of each strip. The light transmitted represents
the operations in Eqs. (2). To minimize errors due to phosphor
persistence, about 0.2 μsec should be allowed per deflection.

But even at this "slow" speed, we obtain 5 Fourier transform
coefficients per microsecond, and the one-dimensional transform
of a 1000 × 1000 picture requires only about 0.2 sec. Parallel
processing for Fourier coefficients eliminates the need for any
algorithms of the Cooley-Tukey type.

(3)  As  far  as  filtering  is  concerned,  a  number  of  factors 
must  be  considered.  As  a  rule,  the  objective  is  to  try  a  number 
of  filters  with  the  same  picture.  Therefore,  in  the  arrangement 
described  for  convolution,  the  filter  stands  for  picture  B, 
ahead  of  picture  A  which  is  to  be  filtered.  The  parallel  output 
can  be  used  for  direct  viewing,  or  photographed  (with  all  the 
processing delays involved), or it can be recorded on the
cathodochromic screen of an additional image converter tube. (This also
takes  time.)  For  all  other  uses,  a  sequential  output  is  essential. 

If  the  filter  is  available  in  sequential  form  on  video  tape, 
transcribed, e.g., from the output of a digital computer or of a
flying  spot  scanner,  it  is  presented  on  the  screen  of  a  cathode 
ray  tube  rather  than  of  an  image  converter  tube.  The  light  emitted 
by  B  and  transmitted  through  A  is  picked  up  by  a  photomultiplier 
separately  for  every  resolution  element  and  is  recorded  in  conven¬ 
tional  fashion. 


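\[ H(n, m) = G(n, m)\,S(m) \]

The problem is to find h, the inverse transform of H. The derivation
shown appears to be self-explanatory. With

\[ G(n, m) = \sum_{\mu=1}^{N} g(n, \mu)\,W^{m\mu}, \qquad W = e^{2\pi i/N}, \qquad W^{N} = 1, \]

we have, if \mu < p,

\[ \sum_{m=1}^{N} S(m)\,W^{-m(p-\mu)} = s(p-\mu), \]

and, if \mu \ge p,

\[ \sum_{m=1}^{N} S(m)\,W^{-m(p-\mu)} = \sum_{m=1}^{N} S(m)\,W^{-m(N+p-\mu)} = s(N+p-\mu), \quad \text{since } N+p-\mu > 0. \]

Therefore

\[
\begin{aligned}
h(n, p) &= \sum_{m=1}^{N} H(n, m)\,W^{-mp} = \sum_{m=1}^{N} G(n, m)\,S(m)\,W^{-mp} \\
        &= \sum_{\mu=1}^{N} g(n, \mu) \sum_{m=1}^{N} S(m)\,W^{-m(p-\mu)} \\
        &= \sum_{\mu=1}^{p-1} g(n, \mu)\,s(p-\mu) + \sum_{\mu=p}^{N} g(n, \mu)\,s(N+p-\mu).
\end{aligned}
\]
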
To see more clearly what is involved in implementing this approach
we write a few of the coefficients h(n, p) explicitly:

h(n, 1) = g(n, 1)s(N) + g(n, 2)s(N-1) + g(n, 3)s(N-2) + ... + g(n, N)s(1)

h(n, 2) = g(n, 1)s(1) + g(n, 2)s(N) + g(n, 3)s(N-1) + ... + g(n, N)s(2)

h(n, 3) = g(n, 1)s(2) + g(n, 2)s(1) + g(n, 3)s(N) + ... + g(n, N)s(3)

h(n, N) = g(n, 1)s(N-1) + g(n, 2)s(N-2) + g(n, 3)s(N-3) + ... + g(n, N)s(N)

As  the  index  p  changes  by  one  unit,  the  transformed  filter 
coefficients  s(m)  are  translated  cyclically  by  one  unit;  i.e.: 


s(N)      s(N-1)    ...    s(3)    s(2)    s(1)

s(N-1)    s(N-2)    ...    s(2)    s(1)    s(N)

s(N-2)    s(N-3)    ...    s(1)    s(N)    s(N-1)
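The equivalence between multiplying by S(m) in the frequency domain and summing products with cyclically shifted coefficients s can be checked numerically. The sketch below is illustrative only: it uses 0-based indices (so index 0 plays the role of index N), unnormalized transforms, and made-up sample values; the function names are not from the text.

```python
import cmath

def dft(values, sign):
    """Unnormalized transform: out[k] = sum_j values[j] * W**(sign*j*k),
    with W = exp(2*pi*i/N)."""
    n = len(values)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(v * w ** (sign * j * k) for j, v in enumerate(values))
            for k in range(n)]

def filter_line_cyclic(g, s):
    """h[p] = sum over mu of g[mu] * s[(p - mu) mod N]; each output element
    uses the coefficients s shifted cyclically by one more step."""
    n = len(g)
    return [sum(g[mu] * s[(p - mu) % n] for mu in range(n)) for p in range(n)]

def filter_line_frequency(g, S):
    """h = inverse transform of H = G * S, where G is the transform of g."""
    G = dft(g, +1)
    H = [G_m * S_m for G_m, S_m in zip(G, S)]
    return dft(H, -1)

g = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0]        # one picture line (illustrative values)
S = [1.0, 0.5, 0.25, 0.0, 0.25, 0.5]      # frequency filter (illustrative values)
s = dft(S, -1)                            # transformed filter coefficients

h_cyclic = filter_line_cyclic(g, s)
h_frequency = filter_line_frequency(g, S)
max_difference = max(abs(a - b) for a, b in zip(h_cyclic, h_frequency))
print(max_difference)                     # agreement up to rounding error
```

The direct form costs N multiplications per output element, which is the work the photomultiplier performs in parallel when the filter function is shone through one picture line.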

To  obtain  the  intensity  h(n,  1)  of  the  first  resolution  element 
on  line  n  of  the  filtered  picture,  we  present  the  first  sequence 
of  the  transformed  filter  coefficients  in  luminous  form  on  the  face 
of  a  cathode  ray  tube  and  shine  this  filter  function  through  line  n 
of  the  original  picture.  The  light  transmitted  through  all  the 
resolution  elements  is  picked  up  and  integrated  by  a  photomultiplier. 
The  operation  is  repeated  after  a  one-step  cyclic  translation  is 
performed on the coefficients s(m). The only way I envision of
performing this cyclic translation is to store the coefficients s(N), ...,
s(1) on a scan converter tube in a circular scan made up of N elements.
The readout also follows a circular pattern, but for each successive
scan line, the scanning starts at the same element where the preceding
scan terminated. Writing and reading rates for a scan converter
tube can be real time or slower so that there is no problem in
recording the coefficients s(m) as they are obtained by a
Fourier transform from the frequency filter S(m). As indicated
above,  the  determination  of  each  coefficient  h(n,  p)  calls  for 
the  scanning  of  one  line  of  the  picture.  This  can  be  done  in  real 
time.  Thus  an  entire  line  of  a  transformed  picture  is  obtained 
per TV frame time. This means that starting with a 1000 × 1000
element picture and a line frequency filter, we obtain a filtered
picture in a little over half a minute.

(4) In the digital processing of images, masking operations
and statistical analysis of neighboring areas have proved very
successful  in  extracting  information.  The  drawback  has  been  the 
length  of  time  required  to  perform  these  operations.  An  equivalent 
approach  has  been  tried  with  a  special  type  of  tube,  the  image 
storage  tube,  where  each  step  of  the  algorithm  is  performed 
sequentially  but  all  points  in  the  picture  are  handled  in  parallel. 
In  principle,  this  would  be  the  ideal  solution,  but  it  was  not 
made  to  work  in  practice  except  under  very  restrictive  conditions 
which  destroyed  the  usefulness  of  the  method. 

The  following  approach,  illustrated  on  a  very  simple  example, 
provides  a  means  for  convenient  parallel  handling  in  the  vicinity 
of  any  particular  point,  though  the  different  points  in  a  picture 
are  handled  sequentially.  Consider  a  resolution  element  symbolized 
by a point (0, 0) of a picture and assume that the processing
affects its 8 nearest neighbors as well. The intensities at that
point and its vicinity constitute the array {g}. In the type of
algorithm considered, the intensity at every point of the area is
multiplied by a preselected number, positive or negative, forming
the array {b}.

\[
\{g\} = \begin{pmatrix}
g_{1,-1} & g_{1,0} & g_{1,1} \\
g_{0,-1} & g_{0,0} & g_{0,1} \\
g_{-1,-1} & g_{-1,0} & g_{-1,1}
\end{pmatrix},
\qquad
\{b\} = \begin{pmatrix}
b_{1,-1} & b_{1,0} & b_{1,1} \\
b_{0,-1} & b_{0,0} & b_{0,1} \\
b_{-1,-1} & b_{-1,0} & b_{-1,1}
\end{pmatrix}
\]

The products are added together to form

\[ J = \sum_{\mu=-1}^{+1} \sum_{\nu=-1}^{+1} g_{\mu\nu}\, b_{\mu\nu} \]

which represents a convolution of arrays {g} and {b}. If all the
numbers of array {b} are positive, the array can be used to form a
transparency with the appropriate intensity values, and the
techniques described above for convolution can be applied. Otherwise, every
quantity has to be measured from a bias level which makes the new
array positive. An appropriate bias function then has to be subtracted
from the final result. This is a standard operation.
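The bias step can be illustrated with a small numerical sketch. The 3 × 3 values below are invented for the example, and the choice of the smallest sufficient bias is an assumption; the text only requires that the shifted array be positive (here, non-negative).

```python
def weighted_sum(g, b):
    """J = sum over the 3 x 3 neighborhood of g[i][j] * b[i][j]."""
    return sum(g[i][j] * b[i][j] for i in range(3) for j in range(3))

def weighted_sum_with_bias(g, b):
    """Same J, computed with non-negative weights only, as a transparency
    would require: shift b by a bias, then subtract the bias contribution."""
    bias = max(0.0, -min(min(row) for row in b))
    b_shifted = [[value + bias for value in row] for row in b]   # all >= 0
    biased_result = weighted_sum(g, b_shifted)
    correction = bias * sum(sum(row) for row in g)               # bias function
    return biased_result - correction

g = [[3, 5, 2],
     [4, 9, 6],
     [1, 7, 8]]                 # intensities around the point (illustrative)
b = [[-1, -1, -1],
     [-1,  8, -1],
     [-1, -1, -1]]              # mask with negative entries (illustrative)

direct = weighted_sum(g, b)
via_bias = weighted_sum_with_bias(g, b)
print(direct, via_bias)
```

The correction term depends only on the local sum of intensities, which is why it can be obtained with the same apparatus using a uniform mask.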


A more flexible approach to the same problem and one that does
not require photographic registry of the algorithm on a transparency
calls for a multibeam cathode ray tube. In such a tube, the entire
array {b} is written in parallel on the screen and is then scanned
across the entire picture to be processed. Such tubes have been
built  by  Litton  Industries  and  by  Sylvania  and  (based  on  a  different 
principle)  are  under  consideration  by  the  Stanford  Research 
Institute.  Arrays  of  up  to  5  X  7  beams  have  been  fabricated,  but 
a  redesign  is  necessary  to  insure  that  the  array  consists  of  adjoining 
intensity  squares  rather  than  an  array  of  discrete  luminescent  spots. 
The  techniques  for  accomplishing  this  are  well  known. 

In  comparing  optoelectronic  and  purely  digital  image  processing, 
we  must  take  into  account  that  the  former  has  speed  on  its  side 
while  the  latter  one  has  high  precision.  But  while  only  minor 
improvements  in  the  components  could  increase  the  precision  of 
optoelectronic  processing  to  the  extent  that  the  system  is  no 
longer  the  limiting  factor,  it  requires  a  breakthrough  in  the  speed 
and  storage  capability  of  a  digital  computer  to  make  it  competitive 
in these areas. If cost effectiveness is a consideration, we
should concentrate on the implementation of optoelectronic system
designs.

The potential for the breadth of applicability of computer graphics
is  demonstrated  to  some  degree  by  the  variety  of  the  figures  which  appear 
in  this  paper.  Such  potential,  however,  cannot  be  realized  until  pictures 
and  drawings  can  routinely  be  entered  into  computers  and  manipulated 
there  with  a  minimum  of  human  effort  and  drudgery.  The  purpose  of  this 
paper  is  to  describe  a  working  process  which  contributes  a  step  toward 
this  goal. 

The  drawings  in  this  paper  were  all  automatically  digitized  and 
transduced  into  a  digital  computer  by  means  of  the  Visicon  AD-1  Automatic 
Digitizing  System  shown  in  Figure  1.  This  system  can  process  an  11"  x  17" 
document  in  58  seconds  at  a  resolution  of  100  samples  per  inch,  or  116 
seconds  at  200  samples  per  inch.  The  digital  output  usually  goes  to 
magnetic  tape,  but  can  alternatively  go  directly  into  a  computer.  The 
digitizer  itself  weighs  roughly  75  pounds  and  can  fit  on  a  desk  top. 
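The data volumes implied by these figures follow from simple arithmetic. The sketch below assumes a square sampling grid over the full 11" x 17" sheet; the function name is illustrative.

```python
def samples_per_document(width_in, height_in, samples_per_inch):
    """Number of raster samples for a sheet scanned on a square grid."""
    return (width_in * samples_per_inch) * (height_in * samples_per_inch)

for samples_per_inch, seconds in ((100, 58), (200, 116)):
    total = samples_per_document(11, 17, samples_per_inch)
    per_square_inch = samples_per_inch ** 2
    rate = total / seconds
    print(samples_per_inch, per_square_inch, total, round(rate))
```

At 200 samples per inch the density is 40,000 samples per square inch, so doubling the resolution quadruples the data while only doubling the scan time.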

Figures  2  through  4  were  reproduced  directly  from  the  digitized  image 
onto  an  electrostatic  (raster)  plotter.  A  close  inspection  of  these  images 
will  reveal  that  lines  on  them  are  composed  of  individual  points.  Each 
image is actually formed from a mosaic (raster) of black and white dots
which  are  represented  in  a  computer  by  means  of  ones  and  zeros.  The 
reasons  for  digitizing  such  drawings  are  manifold.  The  resultant  data  may 
be  transmitted  over  telephone  lines,  inserted  into  computer  files,  analyzed, 
or  may  even  serve  as  a  mechanism  for  controlling  machinery  or  computer 
software. 

The  EEG  of  Figure  2,  for  example,  was  digitized  so  that  a  computer 
could  make  a  frequency  analysis  for  medical  diagnoses.  Such  data  can 
also be input into a computer with a manual digitizer or a data tablet
by  laboriously  retracing  the  lines  of  the  drawing.  The  drudgery  of  such 
retracing,  however,  results  in  poor  human  performance  when  attempted  over 
extended  periods  of  time.  The  work  is  monotonous  and  yet  requires  meticulous 
hand  and  eye  coordination.  One  can  imagine  the  difficulty  of  detecting  and 
removing  those  frequencies  introduced  into  EEG  data  during  the  tracing 
process  by  the  vagaries  of  the  human  muscular  motor  response. 


*This  research  was  funded  in  part  by  a  grant  from  VISICON,  Incorporated 
to  Small  Industries  Research,  The  Pennsylvania  State  University. 


The  printed  circuit  and  typed  lettering  shown  in  Figures  3  and  4 
indicate the quality of the digitized image. Such information can be
inserted into computer archival storage where it is accessible to interactive
updating, computer analysis, or optical character recognition.

A common requirement for processing such digitized images is the
ability to abstract line information from them. That is, the raster of
disassociated black and white points must be collated into sets of line
strokes. As digitization can generate 40,000 data points per square inch
of drawing, such line isolation and identification must be effected by a
very efficient process. The graphic collation (on an IBM 360/65) which
yielded the digital plots shown in Figures 5 through 10 was accomplished
in times essentially proportional to the total line length on the drawing
and independent of its complexity.

The  digitized  data  for  the  map  shown  in  Figure  5  required  23  seconds 
of  computer  time  to  collect  constituent  digitized  points  into  lines,  thin 
such  lines  to  individual  pen  strokes,  and  generate  plot  commands.  Figure 
6  required  183  seconds  of  computer  time  and  reveals  the  results  obtained 
by  processing  a  far  more  complex  drawing.  In  this  instance,  thin  lines 
were  reproduced  as  individual  strokes  and  thicker  symbols  were  represented 
by  their  peripheries. 

Figure  7  shows  the  results  obtained  by  digitizing  cardboard  cutouts 
for  garment  patterns  at  200  samples  per  inch.  The  computer  processing 
time  required  to  generate  the  plotting  commands  was  six  seconds  per 
pattern.  Such  plotting  commands  could  just  as  easily  have  been  used  to 
direct  a  cloth  or  metal  cutting  machine.  The  drawing  then  would  have 
served  as  the  direct  source  of  instructions  for  a  numerically  controlled 
tool. 


The  input  to  this  graphic  collator  is  the  raw  digitized  data.  The 
output  is  a  sequence  of  triples  (X,  Y,  I): 

X,Y  the  Cartesian  coordinates  in  inches  of  a  line 
point 

I  an  indicator  which  is  0  if  the  point  is  the  beginning 
of  a  line  segment  and  is  1  otherwise 

The  indicator,  I,  plays  the  role  of  the  "pen  code"  commonly  employed 
in  digital  plotter  commands.  This  set  of  triples  forms  a  complete 
geometric  description  of  the  lines  which  constitute  the  drawing.  The 
format  of  this  output  can  easily  be  modified  to  interface  with  a  large 
variety  of  digital  plotters,  computer  display  scopes,  and  numerically 
controlled  tools.  In  addition,  this  output  may  be  further  processed  by 
the  computer  to  yield  a  smoother  or  more  compact  analytic  representation 
of  the  data  for  transmission  or  storage  in  computer  files. 
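As a sketch of how the (X, Y, I) triples might be consumed, the following groups them into strokes and converts them to generic pen commands. The command names MOVE and DRAW and both function names are hypothetical; an actual plotter or numerically controlled tool would define its own format.

```python
def group_into_strokes(triples):
    """Split a sequence of (x, y, indicator) triples into line strokes.
    indicator == 0 marks the first point of a new stroke."""
    strokes = []
    for x, y, indicator in triples:
        if indicator == 0 or not strokes:
            strokes.append([])
        strokes[-1].append((x, y))
    return strokes

def to_pen_commands(triples):
    """Map each triple to a pen-up move (indicator 0) or a pen-down draw."""
    return [("MOVE" if indicator == 0 else "DRAW", x, y)
            for x, y, indicator in triples]

triples = [
    (0.0, 0.0, 0), (1.0, 0.0, 1), (1.0, 1.0, 1),   # first stroke
    (2.0, 2.0, 0), (3.0, 2.0, 1),                  # second stroke
]
strokes = group_into_strokes(triples)
commands = to_pen_commands(triples)
print(len(strokes), commands[3])
```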


In many applications, computer analysis of drawings is a very
important requirement. In these cases, it becomes necessary to abstract
and categorize the important features of each picture. Such a process
is quite similar in concept to the grammatical parsing of sentences into
component parts of speech. In essence, a drawing is reduced to its component
line segments to yield a network description of the original.

The  structure  of  the  process  is  very  similar  to  that  employed  by 
programming  language  compilers  on  digital  computers.  The  initial  phase 
of  graphic  collation  has  been  described  and  involves  the  collection  of 
raw  data  points  into  component  lines.  Attendant  to  this  is  the  process 
of  graphical  lexical  analysis  in  which  lines  are  segmented  into  component 
parts,  and  the  interrelationship  of  separate  lines  is  detected  and  noted. 
Basically,  this  involves  isolating  and  measuring  points  where  lines 
terminate,  intersect,  or  exhibit  slope  discontinuities.  The  locations  of 
these  features  along  with  a  path  description  of  the  lines  between  them 
constitute  a  network  description  of  the  original  drawing  which  can  be  used 
to  regenerate  or  categorize  it. 


The  drawing  above,  for  example,  can  be  represented  by  the  line  terminal 
points  1,  3,  5,  and  6;  the  line  intersection  point  4;  and  the  line  slope 
discontinuity  point  2;  along  with  the  tabular  or  analytic  descriptions  f,g, 
and  h  which  categorize  the  curve  shapes  between  these  points.  The  output 
from  the  graphical  lexical  analyzer  for  this  drawing  can  be  represented  in 
tabular  form: 


Initial Point    Terminal Point    Curve

(x1,y1)          (x2,y2)           f
(x2,y2)          (x4,y4)           g
(x3,y3)          (x4,y4)           h
(x4,y4)          (x5,y5)           h
(x4,y4)          (x6,y6)           g


This network description has all the information of an interconnection
matrix and contains the geometric constraints of the system as well.
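To illustrate the relation between the tabular network description and an interconnection matrix, the sketch below rebuilds the matrix from the five table rows. Node coordinates are omitted and nodes are referred to by their point numbers; the data layout and function names are assumptions made for the example.

```python
# (initial point, terminal point, curve label), taken from the table rows
segments = [
    (1, 2, "f"),
    (2, 4, "g"),
    (3, 4, "h"),
    (4, 5, "h"),
    (4, 6, "g"),
]

def interconnection_matrix(segments, node_count):
    """Symmetric 0/1 matrix: entry [a][b] is 1 when points a+1 and b+1
    are joined by a curve."""
    matrix = [[0] * node_count for _ in range(node_count)]
    for start, end, _curve in segments:
        matrix[start - 1][end - 1] = 1
        matrix[end - 1][start - 1] = 1
    return matrix

def node_degrees(matrix):
    """Number of curves meeting at each point."""
    return [sum(row) for row in matrix]

matrix = interconnection_matrix(segments, 6)
degrees = node_degrees(matrix)
print(degrees)
```

Points with a single connection are line terminals, the point with four connections is the intersection, and the point with two connections is the slope discontinuity, in agreement with the classification given for the drawing.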

This  description  can  thus  serve  to  reproduce  the  drawing  or  provide  input 
for specialized analysis programs. The network of Figure 8, for example,
was used on an interactive graphics terminal to generate an electrical
network description of the circuit in a format which could be directly
input to circuit analysis programs for computer simulation or diagnostics.
The curves in this instance were represented by polynomial approximations.


The  output  of  the  graphic  lexical  analyzer  can  also  serve  as  data 
for  a  graphic  syntax  analyzer.  Figures  9  and  10  exemplify  the  results  of 
such  syntactic  analyses.  The  lines  and  important  features  of  Figure  9 
were  detected  and  isolated  by  the  graphic  collator  and  graphic  lexical 
analyzer  to  produce  a  network  description  of  the  drawing.  This  data  was 
then  processed  by  a  graphic  syntactic  analyzer  specially  constructed  to 
recognize  circles,  squares,  and  rectangles.  The  results  shown  in  Figure 
10  are  lines  plotted  from  the  appropriate  words  to  the  proper  figures  on 
the  drawing.  This  analyzer  was  readily  constructed  as  it  had  to  work 
only  on  the  fixed  format  data  structures  of  the  network  rather  than  on 
the imperfect raw data itself. The system works on rough drawings whose
allowable deviations from perfect geometric figures are specifiable by
parameters.


In  summary,  the  automatic  digitizing  software  system  can  be 
categorized  as  consisting  of  three  distinct  sections  which  perform  the 
following  functions: 

1. Graphic Collator - collects and sorts digitized raster
   points into lines

2. Graphic Lexical Analyzer - translates lines into a
   descriptive network of nodal points and connecting paths

3. Graphic Syntactic Analyzer - parses and recognizes
   graphic symbols using the network formed by the
   Graphic Lexical Analyzer


Joseph  Novoshielski,  An  Interactive  Program  for  Automated  Network 
Description,  unpublished  Master’s  paper,  Computer  Science  Department, 
The Pennsylvania State University, University Park, Pa. 16802, 1972

The system described here is capable of automatically and economically
reducing drawings to a format which can be used as direct input to
systems for computer graphics, numerically controlled tools, data transmission,
network analysis, pattern recognition, or design analysis. The
value of such processing is that the accuracy of the result is independent
of the complexity of the original drawing and is a function only of the
resolution of the digitizing operation. Humans attempting to perform
the same tasks manually will inevitably omit some data and introduce
inaccuracies into the rest. The detection and correction of such errors
involves as much work as does the original digitizing.


THE VISICON AD-1 AUTOMATIC DIGITIZING SYSTEM

Printed circuit board digitized by a VISICON AD-1 System and
reproduced on an electrostatic plotter.

Original drawing of circles, squares, and rectangles shown in Figure 10

Drawing digitized by a VISICON AD-1 System and replotted on a digital plotter from data generated from
the Graphic Lexical Analyzer. The Graphic Syntactic Analyzer has located, identified, and
measured the circles, rectangles, and squares in the picture (original drawing shown in Figure 9).


Figure 10


EVALUATION  OF  THE  ROOTS 
OF  CROSS-PRODUCT  BESSEL  FUNCTIONS 


Shih-Chi  Chu  &  Philip  D.  Benzkofer 
Research  Directorate 
Weapons  Laboratory  at  Rock  Island 
U.  S.  Army  Weapons  Command 
Rock  Island,  Illinois 


ABSTRACT 


A  difficulty  frequently  encountered  in  the  solving  of  ordinary 
and  partial  differential  equations  in  the  problems  of  heat  transfer, 
electricity,  fluid  and  solid  mechanics  involving  an  annular  region 
(such  as  a  gun  tube)  subjected  to  Dirichlet,  Neumann,  or  mixed  type 
boundary  conditions  is  that  of  obtaining  the  roots  of  nonlinear 
cross-product Bessel functions. By use of implicit iterative techniques,
a digital computer program was prepared by personnel in
the  Research  Directorate  of  the  Weapons  Laboratory  at  Rock  Island 
to  overcome  this  difficulty  and  thus  to  provide  all  necessary  roots 
for  cross-product  Bessel  functions.  Tables  of  roots  of  certain 
particular  cross-product  Bessel  functions  with  varying  order  are 
given.  These  tables  are  particularly  useful  for  gun-tube  heat 
transfer  analysis  and  can  be  used  directly  by  a  designer  for  the 
calculation  of  transient  temperature  distribution,  and  thermal  stresses 
and  strains  in  a  small  or  large  caliber  gun  barrel. 


The  remainder  of  this  article  was  reproduced  photographically  from 
the  author’s  manuscript. 


1. INTRODUCTION

Cross-product Bessel functions are frequently encountered
in solving Bessel's equation in the problems of heat transfer,
electricity, hydrodynamics, and mechanics. Laslett and
Lewis¹ studied cross-product Bessel functions of the types

\[ J_n(\beta a)\,Y_n(\beta b) - J_n(\beta b)\,Y_n(\beta a) = 0 \tag{1} \]

where J_n, Y_n are Bessel functions of the first and second
kinds of order n, respectively, and

\[ J'_n(\beta a)\,Y'_n(\beta b) - J'_n(\beta b)\,Y'_n(\beta a) = 0 \tag{2} \]

where J'_n, Y'_n are derivatives of the Bessel functions J_n, Y_n,
respectively, with respect to their total arguments.

Kirkham² graphed various combinations of Equations 1 and
2 given above. Extensive tables of roots of Equation 2 are
given by Bridge and Angrist.³ Other authors have solved for
the  roots  of  different  types  of  cross-product  Bessel 
functions.  A  general  form  of  cross-product  Bessel  functions 
that  could  be  reduced  to  many  specific  cases  would  obviously 
be  most  advantageous.  The  purpose  of  this  paper  is  to 
present  several  general  cross-product  Bessel  functions  that 
can  be  reduced  to  specific  cases  with  proper  selection  of 
constants  and  parameters. 

2.  EVALUATION  OF  THE  BESSEL  FUNCTIONS 

To solve the various cross-product Bessel functions, one
must evaluate the individual functions J_n and Y_n. The first
kind of Bessel function, J_n(x), is evaluated by use of the
recurrence relation⁴

\[ F_{n+1}(x) + F_{n-1}(x) = (2n/x)\,F_n(x) \tag{3} \]

Then the desired Bessel function is given by

\[ J_n(x) = F_n(x)/\alpha \tag{4} \]

where

\[ \alpha = F_0(x) + 2 \sum_{m=1}^{\lfloor (M-2)/2 \rfloor} F_{2m}(x) \tag{5} \]

M is initialized at M_0, where M_0 is the greater of M_A and M_B,
where

\[ M_A = \begin{cases} [x + 6], & x < 5 \\ [1.4x + 60/x], & x \ge 5 \end{cases} \tag{6} \]

\[ M_B = [n + x/4 + 2] \tag{7} \]

F_{M-2}, F_{M-3}, ..., F_2, F_1, F_0 are evaluated by use of Equation
3 with F_M = 0 and F_{M-1} = 10^{-30}. M is then incremented by 3 and
J_n is evaluated again. If

\[ \left|\, [J_n(x)]_{M} - [J_n(x)]_{M+3} \,\right| \le \delta \left|\, [J_n(x)]_{M+3} \,\right| \tag{8} \]

is satisfied, then J_n is within the required accuracy range
determined by a given δ > 0. If this condition is not met,
M is again incremented by 3, and the same procedure is used
until the desired accuracy is obtained, with a maximum value
for M given by

\[ M_{\max} = \begin{cases} [20 + 10x - x^2/3], & x < 15 \\ [90 + x/2], & x \ge 15 \end{cases} \tag{9} \]
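The backward recurrence described above can be sketched as follows. This is a minimal illustration, not the subroutine referred to in the text: the function name `bessel_j`, the default value of `delta`, the reading of the square brackets as integer parts, and the decision to return the last value when the upper limit on M is reached are all assumptions. It expects an integer n ≥ 0 and x > 0.

```python
def bessel_j(n, x, delta=1e-10):
    """J_n(x) by backward recurrence with the normalization
    J_0 + 2*(J_2 + J_4 + ...) = 1."""

    def evaluate(m):
        # F_m = 0 and F_(m-1) = 1e-30, then recur downward with Equation 3.
        f = [0.0] * (m + 1)
        f[m - 1] = 1e-30
        for k in range(m - 1, 0, -1):
            f[k - 1] = (2 * k / x) * f[k] - f[k + 1]
        # alpha = F_0 + 2 * (F_2 + F_4 + ... up to index m - 2)
        alpha = f[0] + 2 * sum(f[j] for j in range(2, m - 1, 2))
        return f[n] / alpha

    m_a = int(x + 6) if x < 5 else int(1.4 * x + 60 / x)
    m_b = int(n + x / 4 + 2)
    m = max(m_a, m_b)
    m_max = int(20 + 10 * x - x * x / 3) if x < 15 else int(90 + x / 2)

    previous = evaluate(m)
    while m + 3 <= m_max:
        m += 3
        current = evaluate(m)
        if abs(current - previous) <= delta * abs(current):
            return current
        previous = current
    return previous      # assumption: fall back to the last estimate

print(bessel_j(0, 1.0), bessel_j(1, 1.0), bessel_j(2, 5.0))
```

Starting the recurrence well above the desired order and recurring downward is what keeps the computation stable; the arbitrary starting magnitude 10^{-30} cancels in the ratio F_n/α.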


Similarly, the recurrence relationship for Y_n is given by

\[ Y_{n+1}(x) = (2n/x)\,Y_n(x) - Y_{n-1}(x) \tag{10} \]

For x > 4, the Y_0 and Y_1 Bessel functions are given by the
asymptotic relationships

\[ Y_0(x) = P_0(x)\,\sin(x - \pi/4) + Q_0(x)\,\cos(x - \pi/4) \tag{11} \]

\[ Y_1(x) = -P_1(x)\,\cos(x - \pi/4) + Q_1(x)\,\sin(x - \pi/4) \tag{12} \]

where P_0(x), P_1(x), Q_0(x) and Q_1(x) are defined, but are too
lengthy to include in this discussion.

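For x ≤ 4,

\[ Y_0(x) = \frac{2}{\pi} \sum_{m=0}^{\infty} (-1)^m \frac{(x/2)^{2m}}{(m!)^2} \left[ \log(x/2) + \gamma - H_m \right] \tag{13} \]

where

\[ H_m = \begin{cases} \sum_{r=1}^{m} 1/r, & m \ge 1 \\ 0, & m = 0 \end{cases} \tag{14} \]

γ = Euler's constant = 0.57721 56649

\[ Y_1(x) = -\frac{2}{\pi x} + \frac{2}{\pi} \sum_{m=1}^{\infty} (-1)^{m+1} \frac{(x/2)^{2m-1}}{m!\,(m-1)!} \left[ \log(x/2) + \gamma - H_m + \frac{1}{2m} \right] \tag{15} \]

Then the Y_n's for n > 1 can be obtained from Equation 10 for
any value of x. The two subroutines for evaluating the
Bessel functions are available at the University of Iowa
Computer Center in their subroutine package; both subroutines
were entirely workable in all cases.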
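The small-argument branch can be sketched directly from the series for Y_0 and Y_1 and the forward recurrence of Equation 10. The sketch covers only 0 < x ≤ 4, because the functions P and Q of the asymptotic branch are not given here; the function name `bessel_y` and the fixed number of series terms are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649

def bessel_y(n, x, terms=30):
    """Y_n(x) for integer n >= 0 and 0 < x <= 4: power series for Y_0 and
    Y_1, then the forward recurrence Y_(k+1) = (2k/x) Y_k - Y_(k-1)."""
    half = x / 2
    log_term = math.log(half) + EULER_GAMMA

    # Series for Y_0, with H_0 = 0 and H_m = 1 + 1/2 + ... + 1/m.
    series0, harmonic = 0.0, 0.0
    for m in range(terms):
        if m >= 1:
            harmonic += 1.0 / m
        series0 += ((-1) ** m * half ** (2 * m) / math.factorial(m) ** 2
                    * (log_term - harmonic))
    y0 = (2 / math.pi) * series0

    # Series for Y_1.
    series1, harmonic = 0.0, 0.0
    for m in range(1, terms):
        harmonic += 1.0 / m
        series1 += ((-1) ** (m + 1) * half ** (2 * m - 1)
                    / (math.factorial(m) * math.factorial(m - 1))
                    * (log_term - harmonic + 1 / (2 * m)))
    y1 = -2 / (math.pi * x) + (2 / math.pi) * series1

    if n == 0:
        return y0
    previous, current = y0, y1
    for k in range(1, n):
        previous, current = current, (2 * k / x) * current - previous
    return current

print(bessel_y(0, 1.0), bessel_y(1, 1.0), bessel_y(2, 1.0))
```

Unlike the recurrence for J_n, the upward recurrence is stable for Y_n, which is why it can be started from Y_0 and Y_1.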

3.  DISCUSSION  OF  THE  NUMERICAL  METHOD 

A.  NEWTON-RAPHSON  METHOD 

The  calculation  of  the  roots  of  cross-product  Bessel 
functions  was  accomplished  mainly  by  use  of  the  efficient 
Newton-Raphson  method.  The  iteration  technique  for  the 
Newton Method is defined by

\[ X_{n+1} = X_n - F(X_n)/F'(X_n), \qquad n = 1, 2, \ldots \tag{16} \]

The graph of the function F is approximated by its tangent
at the point X_n, that is, F(X) is replaced by

\[ F(X_n) + (X - X_n)\,F'(X_n) \tag{17} \]


The first problem encountered with this method was that
of finding a good first estimate for X. Since the shape of
the cross-product Bessel functions is not generally known,
this first estimate is difficult to make. The problem is
avoided by the checking of function values for increasing X
until a sign change in the function, say F, is detected. At
this point, evaluate F', the first derivative of F with
respect to its total argument, and then evaluate Equation
16 for a new X. Then, using this new X, evaluate F and F',
checking the value of F to determine whether this value is
within the desired range of accuracy for a root. If the
accuracy requirements are not met, determine a new X from
Equation 16 and repeat the process given above until the
root has been found. When the first root has been determined,
the initial X for finding the second root will be root(1)
plus an arbitrary, small increment of X. The identical
procedure as followed in finding the first root is used to
find the second root. After the first two roots have been
obtained, the increment size for X is set at ΔX = root(2)
- root(1), since the roots are generally periodic. From
this point, all roots past the first two are found by use
of the Newton-Raphson method only. Thus, the change of
sign test is no longer necessary. Close to a simple root,
Newton's method converges quadratically; that
is, with each iteration of X, the number of correct decimal
places in the root is approximately doubled.
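The root-finding strategy just described can be sketched as follows. For brevity the sketch uses sin(X) as a stand-in for F, since its roots are known and equally spaced; a real run would supply a cross-product Bessel function and its derivative. All function names, tolerances, and iteration limits are assumptions.

```python
import math

def newton(f, df, x, tolerance=1e-12, max_iterations=50):
    """Newton-Raphson iteration, stopping when |F(X)| is small enough."""
    for _ in range(max_iterations):
        x = x - f(x) / df(x)
        if abs(f(x)) < tolerance:
            return x
    raise RuntimeError("Newton iteration did not converge")

def scan_for_sign_change(f, x, step, max_steps=100000):
    """Advance X by fixed increments until F changes sign."""
    value = f(x)
    for _ in range(max_steps):
        next_value = f(x + step)
        if value * next_value < 0:
            return x + step
        x, value = x + step, next_value
    raise RuntimeError("no sign change found")

def find_roots(f, df, start, step, count):
    """First two roots: sign-change scan followed by Newton. Later roots:
    Newton started one root spacing beyond the previous root."""
    roots = []
    x = start
    while len(roots) < min(count, 2):
        x = scan_for_sign_change(f, x, step)
        roots.append(newton(f, df, x))
        x = roots[-1] + step
    while len(roots) < count:
        spacing = roots[1] - roots[0]
        roots.append(newton(f, df, roots[-1] + spacing))
    return roots

roots = find_roots(math.sin, math.cos, start=0.5, step=0.1, count=4)
print(roots)
```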

B.  BISECTOR  METHOD 

If  the  slope  or  shape  of  the  cross-product  Bessel 
function  is  relatively  flat,  convergence  may  be  on  a  previous 
or  later  root.  Thus,  another  method  is  necessary  to  solve 
this  type  of  function.  For  lack  of  a  more  suitable  name, 
the  Bisector  method  is  chosen.  The  first  two  roots  can 
generally  be  found  by  the  Newton  method  since  the  sign 
change  is  generally  used  in  conjunction  with  it.  Therefore, 
suppose  that  the  first  two  roots  have  been  determined  and 
that  with  the  Newton  method  the  third  root  is  not  found. 

Then,  return  to  the  second  root,  reset  the  increment  size  of 
X  to  the  increment  used  to  find  the  first  two  roots,  and  add 
this  increment  to  the  second  root.  Thus,  the  first  estimate  to 
find  the  third  root  is  the  second  root  plus  the  small  increment 
of  X.  Then,  use  the  sign  change  technique  and  check  the 
function  value  F  with  increasing  X  for  a  sign  change.  When 
the  sign  change  is  detected,  decrement  X  by  one  increment 
of  X,  then  bisect  the  increment  size  of  X.  Now  proceed  again 
with  the  sign  change  technique.  When  the  next  change  is 
detected,  again  decrement  X  one  increment  of  X,  bisect  the 
increment  size,  and  proceed  with  the  sign  change  test. 
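The stepping and halving procedure can be sketched as below, again with sin(X) standing in for the cross-product function. Instead of stepping past the sign change and then stepping back, the sketch simply declines to advance when the next step would cross it, which is equivalent; the function name, tolerance, and step limit are assumptions.

```python
import math

def bisector_root(f, start, step, tolerance=1e-10, max_steps=1000000):
    """March in X with a fixed increment; whenever the next increment would
    cross a sign change, stay in place and halve the increment."""
    x = start
    value = f(x)
    for _ in range(max_steps):
        if step <= tolerance:
            return x
        next_value = f(x + step)
        if value * next_value <= 0:
            step /= 2                 # sign change within [x, x + step]
        else:
            x += step                 # no sign change yet: keep marching
            value = next_value
    raise RuntimeError("no root located")

second_root = 2 * math.pi
third_root = bisector_root(math.sin, second_root + 0.1, 0.1)
print(third_root)
```

Each halving gains one binary digit, so roughly thirty halvings are needed to reach ten decimal places from an increment of 0.1, compared with a handful of Newton iterations.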

Repeated iteration of this process will give us the desired
roots. This method was quite workable in all cases tested;
its only disadvantage is that its convergence is slower
than that of the Newton method.

Thus, with the computer program developed to solve for the
roots  of  cross-product  Bessel  functions,  any  number  of  roots 
can  be  found  for  virtually  any  type  of  cross-product  Bessel 
function.  The  specific  cases  can  then  be  fitted  into  the 
following  general  equation  forms: 

\[
\begin{aligned}
F(x) = {} & [C_1 J_n(bx) + C_2\,x\,J_{n+1}(bx)]\,[C_3 Y_n(ax) + C_4\,x\,Y_{n+1}(ax)] \\
          & - [C_5 J_n(ax) + C_6\,x\,J_{n+1}(ax)]\,[C_7 Y_n(bx) + C_8\,x\,Y_{n+1}(bx)]
\end{aligned}
\tag{18}
\]

and

\[
\begin{aligned}
F(x) = {} & [C_1 J'_n(bx) + C_2\,x\,J'_{n+1}(bx)]\,[C_3 Y'_n(ax) + C_4\,x\,Y'_{n+1}(ax)] \\
          & - [C_5 J'_n(ax) + C_6\,x\,J'_{n+1}(ax)]\,[C_7 Y'_n(bx) + C_8\,x\,Y'_{n+1}(bx)]
\end{aligned}
\tag{19}
\]

where C_1, C_2, ..., C_8 are constants or parameters, and J_n,
Y_n, J'_n and Y'_n are as defined previously. If the cross-product
Bessel function cannot be fitted into the general
form of Equations 18 and 19, one could simply supply a new F
and F'. Several functions have been tabulated to illustrate
how one can determine any number of roots for any order n.
For lack of space, each table is given only in part with
respect to the magnitude of n and the number of roots.

4.  ROOTS  OF  SPECIFIC  CROSS-PRODUCT  BESSEL  FUNCTIONS 

Suppose one sets C_1 = C_5 = -1, C_3 = C_7 = 1, and C_2 = C_4 =
C_6 = C_8 = 0 in Equation 18. One obtains identically Equation
1. Since this specific cross-product Bessel function is
commonly found in various fields, the roots of Equation 1
are tabulated for order n = 0 through n = 10 in Table I.
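The reduction to Equation 1 can be written out term by term. The substitution below assumes the constants are C_1 = C_5 = -1, C_3 = C_7 = 1, with the remaining constants zero, and identifies the variable x of the general form with β in Equation 1.

```latex
\begin{aligned}
F(x) &= [\,-J_n(bx)\,]\,[\,Y_n(ax)\,] - [\,-J_n(ax)\,]\,[\,Y_n(bx)\,] \\
     &= J_n(ax)\,Y_n(bx) - J_n(bx)\,Y_n(ax),
\end{aligned}
```

so that F(x) = 0 is Equation 1 with β = x.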

Another  common  type  of  cross-product  Bessel  function 
that  has  identical  coefficients  as  assigned  above,  but 
applied  to  Equation  19  is  tabulated  in  Table  II. 

Thus,  with  the  use  of  the  computer  program  developed 
in  this  study,  one  can  find  any  number  of  roots  of  a  given 
cross-product Bessel function.

ABSTRACT. This paper reports the results of a study to determine pattern
vectors  (Profiles)  composed  of  physiological  and  biochemical  measurements 
which  reflect  the  severity  of  injury  to  traumatized  individuals.  Profiles, 
selected by clinicians at the University of Maryland Center for the Study
of  Trauma,  were  obtained  from  the  Center  data  bank  and  subjected  to  pattern 
analyses  using  OLPARS,  an  on-line  pattern  analysis  and  recognition  system, 
located  at  Rome  Air  Development  Center.  Prognosis  regions  were  delineated 
in  the  Eigenvector  Plane  and  the  Discriminant  Plane.  The  time  courses  of 
individual  patients  were  plotted  in  the  Eigenvector  Plane. 

INTRODUCTION.  Shock  is  usually  associated  with  severe  injury  to  the  soft 
tissues,  the  skeleton,  and  to  specific  organs.  Tissue  injury,  hemorrhage, 
and  pain  cause  a  multidimensional  and  widespread  body  response  to  injury 
which  can  involve  every  organ  system  within  the  body.  Moreover,  the 
responses  are  interconnected  in  a  very  complex  way. 

The  Center  for  the  Study  of  Trauma  at  the  University  of  Maryland 
Hospital  was  established  to  study  the  effect  of  inadequate  tissue  perfusion 
induced by injury at the organ and tissue level by assessing physiological
and biochemical responses.

In support of this objective, several analyses using pattern recognition
techniques have been conducted by clinicians and researchers from
the Center, together with analysts from the Biomedical Laboratory (Edgewood
Arsenal, Maryland) and the Army Materiel Systems Analysis Agency
(Aberdeen Proving Ground, Maryland).

A  data  bank  at  the  Center  contains  clinical,  cardiovascular,  metabolic, 
and  therapeutic  data  on  over  a  thousand  patients. 

The  initial  step  in  the  study  was  to  determine  pattern  vectors  composed 
of  physiological  and  biochemical  measurements  which  reflect  the  severity  of 
a  patient's  traumatic  state,  that  is,  to  determine  prognosis  regions  of  the 
pattern  space  and  to  analyze  the  time  course  of  patients  as  a  function  of 
therapy. 

A candidate pattern profile, selected by the clinicians, was subjected
to pattern analysis routines, using OLPARS (an On-Line Pattern Analysis and
Recognition System belonging to Rome Air Development Center), in an effort to
find some structure in the data and to delineate various prognosis regions.

The pattern profile consisted of 12 measurements, so in this instance the
condition of each patient was characterized by a 12-dimensional vector
$X = (x_1, x_2, \ldots, x_{12})$. From the data bank, profiles were retrieved on 140
patients, 70 of whom ultimately recovered, and 70 of whom ultimately died
in the Center. The methods and results of the analyses will now be described.

METHODS 


Patient Sample and Data

Included  in  this  study  were  initial  and  final  measurements  from 
140  patients,  70  of  whom  died,  and  70  of  whom  survived.  For  each  patient, 
the  first  measurement  on  each  variable  was  called  the  initial  value  and 
the  last  measurement  before  death  or  discharge,  the  final  value.  The 
set of 12 measurements used in this study* consisted of systolic blood pressure
(SBP), diastolic blood pressure (DBP), hemoglobin (Hgl), hematocrit (Hmt),
serum fibrinogen (Fib), serum sodium (Na), serum potassium (K), serum
chloride (Cl), serum osmolality (Osm), blood urea nitrogen (BUN),
glucose (Gl), and serum creatinine (Cr).

The set of measurements will be called a profile or a pattern vector.
Throughout the study the vectors were considered as belonging to classes
A, B, L, and D, where:

A is the set of 70 vectors composed of final measurements
on the surviving patients;

B is the set of 70 vectors composed of final measurements
on the dying patients;

L is the set of 70 vectors composed of initial measurements
of the surviving patients; and

D is the set of 70 vectors composed of initial measurements
of the dying patients.

*Results of a study of organ system profiles will be reported in another
paper.


Subsequent to the start of the study, nine vectors from Class A were
discarded. These vectors came from patients who were still seriously ill
when they left the Center.


Analyses 

In this study Class A was considered to be the control set. Indeed,
the vectors in Class A are the final measurements of patients who recovered
and should be (and in this case are) close to normal values. The means
and standard deviations for the measurements in Class A are given in Table 1.

All of the data were normalized with respect to the vectors of Class
A. More specifically, let the pattern vector be $X = (x_1, \ldots, x_{12})$. Let
$\bar{x}_i^A$ and $\sigma_i^A$ be the mean and standard deviation of $x_i$ with respect to Class
A. Then each X vector was transformed into a new X vector whose component
$x_i^{new}$ was defined by the relationship

$$x_i^{new} = \frac{x_i^{old} - \bar{x}_i^A}{\sigma_i^A}$$

The  normalized  mean  values  for  each  class  and  the  total  data  set  are 
given  in  Table  2. 
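
The normalization with respect to Class A can be sketched in a few lines of Python. This is only an illustration and not the OLPARS routine: the function names and the three toy two-component vectors are made up, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

def class_a_statistics(class_a):
    """Per-measurement mean and standard deviation over the Class A vectors."""
    columns = list(zip(*class_a))            # one tuple per measurement
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]   # assumption: sample (n - 1) form
    return means, stds

def normalize(vector, means, stds):
    """Apply x_new = (x_old - mean_A) / std_A to each component."""
    return [(x - m) / s for x, m, s in zip(vector, means, stds)]

# Toy data (hypothetical): two measurements per patient.
class_a = [[120.0, 4.0], [130.0, 4.5], [140.0, 5.0]]
means, stds = class_a_statistics(class_a)
normalized_a = [normalize(v, means, stds) for v in class_a]
```

By construction the Class A vectors have zero mean in every normalized component, which is consistent with the Class A column of Table 2 being all zeros.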


Profile  Analyses 

The normalized vectors from all four classes were subjected to a
detailed structure analysis. Several mathematical transformations including
Non-Linear Mapping, Eigenvector Plane Mapping, and Discriminant Plane
Mapping were used in an effort to uncover some revealing aspects of the
inherent data structure. For this data set the latter two mappings
appeared to provide the better settings for delineating prognosis regions.
The Eigenvector Plane Mapping projects the original data onto a plane
which "best" fits the data in the linear least squares sense. Minimizing
the sum of the squared distances of the data from a two-dimensional subspace
of the original space requires the solution for the eigenvectors of the lumped
data covariance matrix. The eigenvector plane is defined by the two
eigenvectors $E_1$, $E_2$ corresponding to the two largest eigenvalues of the
matrix. The Discriminant Plane Mapping projects two-class data onto a
plane which enhances discrimination between data classes. The plane is
defined by the discriminant vector $D_1$ (which is the direction which
maximizes the projected between-class scatter relative to the sum of
projected within-class scatter) and a vector $D_2$, where $D_2$ is that direction
orthogonal to $D_1$ which maximizes the projected between-class scatter
relative to the sum of the projected within-class scatter. For a more
complete description of these techniques see Reference 1. The eigenvectors
$E_1$ and $E_2$ are given in Table 3. The discriminant vectors for the
four classes taken two at a time (namely, A-B, A-L, A-D, B-L, B-D, L-D)
are given in Table 4. All of the eigenvectors and discriminant vectors
are unit vectors. Therefore, the mapping of a vector X to a point
$(y_1, y_2)$ in the Eigenvector Plane, say, is given by the scalar products
$y_1 = E_1 \cdot X$, $y_2 = E_2 \cdot X$. The Eigenvector Plane Plot is given in Figure 1.

Figure  2  gives  a  count  of  the  number  of  vectors  from  each  class  which 
are  located  in  each  square  of  the  grid  appearing  in  Figure  1.  The 
Discriminant  Plane  Plots  for  pairs  A-B  and  A-D  are  given  in  Figures 
3  and  4. 
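
The two plane mappings described above can be illustrated with a short NumPy sketch. It is not the OLPARS implementation. The function names and the synthetic four-dimensional data are hypothetical, and the closed form used for the second discriminant direction is an assumption derived from the stated criterion (maximize the scatter ratio subject to orthogonality with the first direction), since the text defers the details to Reference 1.

```python
import numpy as np

def eigenvector_plane(data):
    """Unit eigenvectors of the lumped covariance matrix that belong to
    its two largest eigenvalues (rows of `data` are pattern vectors)."""
    cov = np.cov(data, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)        # eigenvalues come back ascending
    return eigvecs[:, -1], eigvecs[:, -2]

def discriminant_plane(class_1, class_2):
    """Unit vectors D1, D2 for two classes (rows are pattern vectors)."""
    delta = class_1.mean(axis=0) - class_2.mean(axis=0)
    # W: sum of the two within-class scatter matrices.
    w = (np.cov(class_1, rowvar=False) * (len(class_1) - 1)
         + np.cov(class_2, rowvar=False) * (len(class_2) - 1))
    w_inv = np.linalg.inv(w)
    d1 = w_inv @ delta                      # Fisher direction W^-1 delta
    # Assumed form: D2 = (W^-1 - b W^-2) delta, with b set so D1 . D2 = 0.
    w_inv2_delta = w_inv @ d1
    b = (d1 @ d1) / (d1 @ w_inv2_delta)
    d2 = d1 - b * w_inv2_delta
    return d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)

def project(x, v1, v2):
    """Map a pattern vector X to the plane point (v1 . X, v2 . X)."""
    return float(v1 @ x), float(v2 @ x)

# Synthetic stand-in data: two classes of 60 four-dimensional vectors.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(60, 4))
class_b = rng.normal(0.0, 1.0, size=(60, 4)) + np.array([2.0, 0.0, 0.0, 0.0])
e1, e2 = eigenvector_plane(np.vstack([class_a, class_b]))
d1, d2 = discriminant_plane(class_a, class_b)
y1, y2 = project(class_b[0], e1, e2)
```

Because both pairs of vectors are unit vectors, the plane coordinates are plain scalar products, as in the mapping $y_1 = E_1 \cdot X$, $y_2 = E_2 \cdot X$.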


Measurement  Reduction 


In solving a pattern classification problem it is desirable to
use the minimum number of measurements to achieve a satisfactory solution.
The  OLPARS  system  provides  two  functionals  (Discriminant  Measure  and 
Confusion  Probability)  for  ranking  the  measurements. 

The functionals were used to evaluate the discriminatory power of
each individual measurement. The measurements which ranked high with
respect to both functionals were systolic blood pressure, hematocrit,
fibrinogen, potassium, osmolality, and creatinine. The original profile
was reduced to these six measurements. Structure analyses were repeated
on the reduced vectors. Little loss in discriminating among various data
classes was noted by comparing the Discriminant Plane Plots. For example,
Figure 5 is the Discriminant Plane Plot for pair A-B based on the six
variables. The reader may compare this plot
with Figure 3, the corresponding plot based on 12 variables.
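
The idea of ranking measurements one at a time can be sketched as follows. The definitions of the OLPARS Discriminant Measure and Confusion Probability are not given here, so this example substitutes a simple single-variable Fisher-type ratio as an assumed stand-in. The function names and the toy normalized values are hypothetical.

```python
from statistics import mean, variance

def fisher_ratio(values_1, values_2):
    """Squared difference of class means over the sum of class variances."""
    gap = mean(values_1) - mean(values_2)
    return gap ** 2 / (variance(values_1) + variance(values_2))

def rank_measurements(class_1, class_2, names):
    """Sort measurement names from most to least discriminating."""
    scores = {
        name: fisher_ratio([v[i] for v in class_1], [v[i] for v in class_2])
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda name: scores[name], reverse=True)

# Hypothetical toy vectors: (SBP, Na) in normalized units.
class_a = [[0.1, 0.2], [-0.2, -0.4], [0.1, 0.2]]
class_b = [[-3.4, 0.1], [-3.6, -0.3], [-3.5, 0.0]]
ranking = rank_measurements(class_a, class_b, ["SBP", "Na"])
```

In this toy case SBP separates the two classes far better than Na, so it is ranked first. A reduced profile would keep only the top-ranked measurements.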

Delineation  of  Good  and  Poor  Prognosis 
Regions  and  Patient  Trajectories 

Figure 6 gives the Eigenvector Plane Plot for the six-dimensional
profile. This plot was used to delineate prognosis regions, with good
(G) and poor (P) regions as indicated. Region G contains 46 vectors
from Class A, 2 vectors from B, 31 vectors from L, and 14 vectors from
D. The regions labeled with a P contain 14 vectors from A, 68 from B,
39 from L, and 56 from D.

The Eigenvector Plane is also being used to observe the time courses
of patients. Indeed, let $X_1, X_2, \ldots, X_n$ be a time-ordered sequence of
pattern vectors obtained from a patient. Let $Y_1, Y_2, \ldots, Y_n$ be the
corresponding points in the Eigenvector Plane. We shall call this latter
sequence a trajectory. From the trajectory we can observe a patient's
current state and the rate of change of the state. From a family of
trajectories we may compute transition probabilities from region to
region as functions of therapy. Figures 7 to 10 show sample trajectories
of daily profiles of various patients, some who survived (designated S)
and some who expired (designated D). Each trajectory is labeled
1, 2, ..., for Day 1, Day 2, ..., respectively.
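
A minimal sketch of how region-to-region transition probabilities might be tallied from a family of trajectories is given below. The region rule is purely hypothetical (the actual G and P boundaries are drawn on Figure 6), and the two trajectories are made-up plane points. Conditioning on therapy would amount to keeping one such tally per therapy.

```python
from collections import Counter

def region_of(point):
    """Hypothetical region rule; the real G/P boundaries are on Figure 6."""
    y1, _ = point
    return "G" if y1 > -1.0 else "P"

def transition_probabilities(trajectories):
    """Estimate P(next region | current region) from day-to-day steps."""
    counts = Counter()
    for trajectory in trajectories:
        regions = [region_of(p) for p in trajectory]
        counts.update(zip(regions, regions[1:]))
    totals = Counter()
    for (start, _), n in counts.items():
        totals[start] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}

# Two made-up trajectories of daily (y1, y2) points.
trajectories = [
    [(-3.0, 0.5), (-2.0, 0.2), (-0.5, 0.1), (0.0, 0.0)],   # P, P, G, G
    [(-0.5, 0.0), (-1.5, 0.3), (-2.5, 0.8)],               # G, P, P
]
probabilities = transition_probabilities(trajectories)
```

Each key of `probabilities` is a (from, to) pair, so `probabilities[("P", "G")]` is the estimated chance that a patient in the poor region on one day is in the good region on the next.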

SUMMARY AND OBSERVATIONS. A pattern profile consisting of 12 physiological
and biochemical measurements was used to reflect the severity
of a patient's traumatic state. Prognosis regions were delineated in
the  Eigenvector  Plane  and  Discriminant  Plane  based  on  initial  and  final 
measurements  from  140  patients,  70  of  whom  ultimately  recovered  and  70 
of  whom  died  in  the  Center.  The  original  profile  was  reduced  to  6 
variables  with  little  apparent  alteration  of  prognosis  regions.  The 
Eigenvector  Plane  Plot  based  on  the  6-dimensional  profile  was  used  to 
exhibit  sample  trajectories  of  several  patients. 

The  Discriminant  Plane  may  be  used  as  well  for  plotting  trajectories. 
Indeed,  we  are  currently  converting  the  data  from  over  1,000  patients 
into  trajectories  in  the  Eigenvector  Plane  and  the  Discriminant  Plane. 

In addition, we are computing trajectories based on "distance" from
normality using original variables for various profiles. Some of this
work is directed toward isolating a small but efficient set of measurements
which can be used as a basis for initiating and evaluating
therapies. We call this set an "Action Profile." Clinicians are polled
to determine their preferences. Since each clinician usually has his
own priorities, we would like to establish the smallest set which is
sufficient to regulate, say, 90% of patients suffering from trauma.

REFERENCES 


1. J. W. Sammon, "Interactive pattern analysis and classification,"
IEEE Transactions on Computers, Volume C-19, Number 7 (1970).

2.  J.  Siegal,  R.  Goldwyn,  and  M.  Friedman,  "Pattern  and  process  in 
the  evolution  of  human  septic  shock,"  Surgery,  Volume  70,  Number 
2  (1971). 




TABLE 1

Means and Standard Deviations for Measurements from Class A

Measurement                 Mean      Standard Deviation
Systolic Blood Pressure     129.5     16.3
Diastolic Blood Pressure    79.14     12.7
Hemoglobin                  12.24     2.03
Hematocrit                  35.87     5.79
Fibrinogen                  350.0     139.0
Sodium                      141.5     6.37
Potassium                   4.426     0.910
Chloride                    100.6     8.85
Osmolality                  302.2     18.4
Blood Urea Nitrogen         22.7      16.7
Glucose                     128.7     59.2
Creatinine                  1.40      1.38


TABLE 2

Normalized Mean Values

Measurement   Total Data Set   Class L     Class D     Class A   Class B
SBP           -1.609           -1.301      -1.423      0         -3.505
DBP           -.9356           -.7051      -.9724      0         -1.990
Hgl           -.1418           .1684       .06782      0         -.7850
Hmt           -.1332           .2643       .07497      0         -.8549
Fib           -.4457           -.5592      -.4726      0         -.6937
Na            -.1461           -.1113      -.3868      0         -.06754
K             .1707            -.1953      .09207      0         .7641
Cl            -.2870           -.00326     -.3836      0         -.7243
Osm           1.107            .3724       1.435       0         2.480
BUN           1.018            .3448       1.376       0         2.220
Gl            1.041            .9485       1.372       0         1.708
Cr            1.062            .4428       1.175       0         2.495

TABLE 4 - Discriminant Vectors

                       A-B                        A-L
Measurement     D1          D2             D1               D2
SBP             .847        .121           -.364            -.203
DBP             -.120       .115           .0396            -.136
Hgl             -.0106      .671           -.154            .671
Hmt             .198        -.648          .383             -.568
Fib             .169        .141           -.443            -.249
Na              .254        -.0626         -.00577          -.0315
K               .0592       -.146          -.421            -.0591
Cl              .115        .102           -.0883           -.113
Osm             -.342       -.0798         -.0372           -.0597
BUN             -.0221      -.154          .0280            -.0140
Gl              -.0372      -.129          .435             .257
Cr              -.0231      -.00698        .34[illegible]   .136

                       B-L                        B-D
Measurement     D1          D2             D1               D2
SBP             .442        .439           .689             .434
DBP             -.00652     -.0125         -.0835           -.0497
Hgl             -.382       .623           -.151            .761
Hmt             .587        -.399          .605             -.430
Fib             .128        .111           .224             .115
Na              .173        .128           .0733            -.00767
K               -.193       -.118          -.222            -.127
Cl              .333        .256           .135             .0835
Osm             -.330       -.353          -.0266           -.0592
BUN             -.0842      -.139          -.0608           -.0646
Gl              -.0627      -.0892         -.0111           -.0285
Cr              -.000989    -.000203       -.0498           -.0534


AERODYNAMIC PARAMETER IDENTIFICATION
IN  BALLISTIC  RANGE  TESTS 

Gary  T.  Chapman 

Ames  Research  Center,  NASA 
Moffett  Field,  California 


INTRODUCTION. In general, the problem of aerodynamic parameter identification
has as its goal the determination of certain aerodynamic forces
and moments that will be used to predict the dynamic behavior of some
full-scale vehicle or projectile as it flies through the atmosphere.
Here, then, the problem of parameter identification must not only be concerned
with the problem of determining the aerodynamics in laboratory
tests, but must also consider how those results can be applied to a
geometrically similar but larger vehicle flying at some altitude above
sea level. The purpose of this paper is to consider a "real" world
problem and trace through the various steps that may be required to
obtain the aerodynamic data needed to calculate the dynamic behavior of
the vehicle. In this approach I will concern myself with both the
physical world and a mathematical world. A simplified box diagram of
the basic process that will be discussed is shown in Figure 1. The
left-hand series of boxes represents physics, the right-hand boxes
mathematics. The physical world problem is at the top left, with its
corresponding mathematical counterpart on the top right. As we move
down the chart to the second box, we simplify the problem until we
arrive at a satisfactory experiment and corresponding mathematical model,
depicted by the third box down. The data from the experiment
are combined with the mathematical model in what is classically called
the parameter identification step. From this step we get not only
results that will be applied to the real problem but also information
on experimental errors that can be fed back into the experiment to
improve the data, and information about the accuracy of the mathematical
model that should be fed back to improve the mathematical model. Note
also that there are dotted lines connecting the physical side of the
chart to the mathematical side. These indicate that the model
selection process arises from the joint flow of information back and
forth between laboratory and mathematical considerations.

In  the  following  sections  we  will  discuss  briefly  the  pressures  that 
cause  one  to  move  to  simpler  modeling  (both  mathematical  and  physical). 

We  will  then  discuss  in  detail  the  interacting  of  the  experiment  and 
mathematical  model  through  parameter  identification.  Where  possible, 
concrete  examples  of  all  the  steps  will  be  given.  In  any  one  case  all 
of  the  steps  are  not  followed;  hence  a  single  set  of  data  cannot  be 
followed  throughout,  but  rather  the  various  ideas  are  illustrated  by 
representative  data.  Much  of  the  information  we  will  describe  is 
contained  in  detail  in  Reference  1. 


PROBLEM  DEFINITION.  The  physical  problem  we  will  try  to  find  answers  to 
will  be:  What  are  the  forces  and  moments  that  act  on  a  full  scale 
vehicle  flying  in  an  arbitrary  atmosphere?  One's  first  thought  might  be 
that  there  must  be  theoretical  prediction  techniques  available  to  compute 
the  aerodynamic  forces  and  moments,  and  there  are,  but  they  are  not 
sufficiently  good  at  this  time  to  risk  a  multimillion  dollar  vehicle  on 
the  results  from  them.  The  next  question  may  be:  Would  it  be  possible 
to  construct  a  theoretical  procedure  that  would  be  accurate  enough?  The 
answer  is  yes  in  principle,  but  we  lack  the  large-scale  computer  required 
to  make  the  calculation.  The  ILLIAC  IV  computer  presently  being  installed 
at  Ames  Research  Center  may  be  able  to  provide  such  results  but,  even 
there,  we  would  want  some  supporting  information  particularly  for  turbulent 
or separated flows. Hence, we are left with the need to test, but not the
full-scale vehicle, for it is expensive. Therefore we must consider smaller-scale
experiments. This step away from the real physical problem takes us
into what will be referred to here as the experimental modeling space and
correspondingly the mathematical modeling space.


Modeling  Space 

We  referred  to  the  step  of  subscale  modeling  as  a  modeling  space 
because  there  still  remain  options  as  to  which  particular  model  will 
best  meet  the  needs  within  any  constraints  that  may  exist.  There  may  be 
no  unique  answer  to  the  question  of  which  modeling  method  is  best;  in 
fact,  some  of  the  techniques  may  be  complementary. 

The first step one might consider is to test a one-half to one-fourth
scale model of the vehicle with the internal systems greatly
simplified. Even here, the cost is prohibitive in all but a very few
cases. Next, we consider ground-based tests such as in a wind tunnel or
a ballistic range. These two approaches are not necessarily exclusive.
Availability normally dictates which will be used, but one should be
aware of the limitations and advantages of each because the results may be
altered significantly by the choice. For example, in a wind tunnel, it
is easy to measure forces and moments but the presence of a model support
may affect the data. The wind tunnel may also be limited in its ability
to simulate the proper environment at very high speeds. In the ballistic
range, on the other hand, force and moment results are difficult to obtain;
they have to be inferred from the measured motion of a small model as it
flies through a suitably instrumented range. The ballistic range test,
however, can simulate the flight conditions better and there are no support
effects to worry about. Figure 2 shows a typical ballistic range shadowgraph
from which the motion measurements must be made. For the remainder
of the work we will assume that the ballistic range has been selected.

Dimensional Analysis


No matter which technique had been selected, we would be faced with
the first major problem that parameter identification must deal with;
that is, how are results obtained on small-scale models in a laboratory
facility applicable to full-scale flight? Hence, we must understand
scaling and simulation rules. We will take a brief look at this from
the standpoint of dimensional analysis. To be specific, we will look
at the drag of our vehicle to see what we must simulate and how we must
treat the results. The basic idea of dimensional analysis is as follows:
In the functional relationship between the quantity of interest (drag,
in this case) and the other important variables of the problem, such as air
density $\rho$, flight velocity $U$, reference length $L$, vehicle configuration
$C$, and vehicle orientation $OR$, the important variables must appear in
combinations such as to yield the same dimensions as the quantity of
interest.

Equation (1)

$$\mathrm{Drag} = D(\rho, U, L;\ C, OR) \qquad (1)$$

shows such a functional relationship, and Equation (2)

$$C_D = \frac{\mathrm{Drag}}{(1/2)\rho U^2 L^2} = C_D(C, OR) \qquad (2)$$

shows the particular grouping of $\rho$, $U$, and $L$ that yields the dimensions
of drag. In this case they have been used to nondimensionalize the
drag and hence produce a drag coefficient that in principle depends only
on the remaining two nondimensional quantities $C$ and $OR$. (The 1/2
appearing in the denominator arises historically because the quantity
$(1/2)\rho U^2$ is a term that appears in simple fluid mechanics problems and is
referred to as the dynamic pressure.) The dependence shown in Equation
(2) is found not to be sufficient in practice. The fluid viscosity ($\mu$)
and speed of sound ($a$) have also been found to be important. When these
two quantities are incorporated we get

$$\mathrm{Drag} = D(\rho, U, L, \mu, a;\ C, OR). \qquad (3)$$

Since we have already established a group of terms which has the same
units as the drag, the remaining variables must form nondimensional
groups. When we do this, we get the results shown in Equation (4),
where the drag coefficient is shown to be a function of the configuration,
the orientation, and two parameters: one called the Reynolds number
Re, and the other the Mach number M,

$$C_D = \frac{\mathrm{Drag}}{(1/2)\rho U^2 A} = C_D(M, Re;\ C, OR), \qquad (4)$$

where $M = U/a$ and $Re = \rho U L/\mu$. The Reynolds number is an indication of the
viscous drag effects, and the Mach number, the compressibility effects
brought on by flying at speeds other than those much less than the speed
of sound. Note also that a reference area $A$ has been used in place of
$L^2$. This last functional relationship [Equation (4)] tells us that as
long as our laboratory test is of a model with the same configuration and
at the same orientation as the full-scale vehicle and flying at the same
Mach number and Reynolds number, then the drag coefficients for the two
cases are the same and we can use laboratory tests to establish full-scale
results. All other aerodynamic forces and moments must be treated
in a similar manner. This then is the basis for going from laboratory
tests to full scale; it has limitations that are not considered here but
they will not influence our discussion (see Ref. 2 for more details).

It  might  seem  that  we  have  not  considered  the  mathematical  modeling  here 
but  it  has  been  implicit  that  both  experience  and  theory  are  used  to 
determine  what  variables  are  important. 
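
The scaling rule can be made concrete with a small numerical sketch. All values below are hypothetical and were chosen only so that the model and the full-scale vehicle share the same Mach and Reynolds numbers. The function names are illustrative, and the same gas properties are assumed in both cases. With M and Re matched, a drag coefficient inferred for the model carries over to flight through $\mathrm{Drag} = C_D\,(1/2)\rho U^2 A$.

```python
import math

def drag_coefficient(drag, rho, u, area):
    """C_D = Drag / ((1/2) * rho * U^2 * A)."""
    return drag / (0.5 * rho * u ** 2 * area)

def mach_number(u, a):
    """M = U / a."""
    return u / a

def reynolds_number(rho, u, length, mu):
    """Re = rho * U * L / mu."""
    return rho * u * length / mu

def full_scale_drag(model_drag, model, flight):
    """Transfer a model drag to flight, assuming equal M, Re, shape, attitude."""
    cd = drag_coefficient(model_drag, model["rho"], model["u"], model["area"])
    return cd * 0.5 * flight["rho"] * flight["u"] ** 2 * flight["area"]

# Hypothetical conditions (SI units), chosen so that M and Re match.
flight = {"rho": 0.018, "u": 680.0, "length": 1.0, "area": math.pi * 0.5 ** 2}
model = {"rho": 1.8, "u": 680.0, "length": 0.01, "area": math.pi * 0.005 ** 2}
mu, a = 1.8e-5, 340.0   # assumption: same viscosity and sound speed for both

re_flight = reynolds_number(flight["rho"], flight["u"], flight["length"], mu)
re_model = reynolds_number(model["rho"], model["u"], model["length"], mu)
drag_flight = full_scale_drag(2.0, model, flight)   # 2.0 N inferred for the model
```

Here the 1 cm model is flown in gas one hundred times denser than the flight atmosphere, which offsets the hundredfold reduction in length and keeps the Reynolds number unchanged.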


Experiment 

We will now consider the ballistic range test in sufficient detail
to obtain a better understanding of the physical and mathematical models
that we will be considering in the classical parameter identification
step which terminates the formal job of obtaining the aerodynamic parameters
for our full-scale vehicle.

The ballistic range consists of a gun from which a small-scale
model of the vehicle is launched into an instrumented range. The instrumentation
consists of a series of orthogonal shadowgraph stations at which
spark shadowgraphs are taken. In addition, the time at which each shadowgraph
is taken is recorded. An example of a shadowgraph of this type was
shown in Figure 2. From the shadowgraphs one obtains the position and
orientation of the model as a function of time or distance. A set of data
from an actual flight is shown in Figure 3(a), (b), (c). The distance x
is along the range and the distances y and z make up an orthogonal set;
z is positive downward in the vertical direction. The angle of yaw is
$\psi$ and the angle of pitch, $\theta$. The roll angle $\phi$ is not shown but was
essentially constant.

Here the first thing we notice is that we are dealing with a problem
involving  six  degrees  of  freedom.  Hence  the  mathematical  modeling  problem 
could  be  very  difficult.  Second,  although  it  is  not  obvious  at  this  point, 
there  is  experimental  error  present  in  the  data;  that  is,  the  data  points 
define  some  exact  trajectory  with  error  superimposed.  These  errors  should 
mostly  be  statistical  with  zero  mean  but,  as  we  will  see  later,  this  is  not 
always  so. 

The mathematical modeling of this six-degree-of-freedom dynamic
system consists of two sets of three second-order differential equations,
one set for linear momentum, Equation (5), and one set for angular
momentum, Equation (6).

$$m\frac{d^2x}{dt^2} = F_x \qquad (5a)$$

$$m\frac{d^2y}{dt^2} = F_y \qquad (5b)$$

$$m\frac{d^2z}{dt^2} = F_z \qquad (5c)$$

$$I_x\dot{p} - qr(I_y - I_z) = M_l \qquad (6a)$$

$$I_y\dot{q} - pr(I_z - I_x) = M_m \qquad (6b)$$

$$I_z\dot{r} - pq(I_x - I_y) = M_n \qquad (6c)$$

where $F_x$, $F_y$, and $F_z$ represent the forces (aerodynamic and gravity)
that act on a body of mass $m$, and $M_l$, $M_m$, $M_n$ are the aerodynamic moments
that act on the body about principal axes that have moments of inertia
$I_x$, $I_y$, and $I_z$, respectively. Note that the angular velocities $p$, $q$,
and $r$ are related to the pitch, yaw, and roll angles (see Ref. 1).
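
As an illustration of the angular momentum set, the sketch below integrates Euler's rigid-body equations about principal axes, assuming Equations (6a)-(6c) have the standard form $I_x\dot{p} - qr(I_y - I_z) = M_l$, $I_y\dot{q} - pr(I_z - I_x) = M_m$, $I_z\dot{r} - pq(I_x - I_y) = M_n$. The inertia values, initial rates, step size, and function names are hypothetical. With all moments set to zero the rotational kinetic energy should stay constant, which gives a simple check on the integration.

```python
def euler_rates(omega, inertia, moments):
    """Solve the three Euler equations for (dp/dt, dq/dt, dr/dt)."""
    p, q, r = omega
    ix, iy, iz = inertia
    ml, mm, mn = moments
    return (
        (ml + q * r * (iy - iz)) / ix,
        (mm + p * r * (iz - ix)) / iy,
        (mn + p * q * (ix - iy)) / iz,
    )

def rk4_step(omega, inertia, moments, dt):
    """Advance (p, q, r) by one classical fourth-order Runge-Kutta step."""
    def shifted(rate, h):
        return tuple(w + h * k for w, k in zip(omega, rate))
    k1 = euler_rates(omega, inertia, moments)
    k2 = euler_rates(shifted(k1, dt / 2), inertia, moments)
    k3 = euler_rates(shifted(k2, dt / 2), inertia, moments)
    k4 = euler_rates(shifted(k3, dt), inertia, moments)
    return tuple(
        w + dt / 6 * (a + 2 * b + 2 * c + d)
        for w, a, b, c, d in zip(omega, k1, k2, k3, k4)
    )

def kinetic_energy(omega, inertia):
    """Rotational kinetic energy (1/2) * sum(I_i * w_i^2)."""
    return 0.5 * sum(i * w * w for i, w in zip(inertia, omega))

inertia = (1.0, 2.0, 3.0)     # hypothetical principal moments of inertia
omega = (1.0, 0.1, 0.5)       # hypothetical initial p, q, r
energy_start = kinetic_energy(omega, inertia)
for _ in range(1000):
    omega = rk4_step(omega, inertia, (0.0, 0.0, 0.0), 0.001)
energy_end = kinetic_energy(omega, inertia)
```

In a parameter identification setting the zero moments would be replaced by an aerodynamic model whose coefficients are the unknowns.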


These two sets of equations would be decoupled except for the aerodynamic
forces and moments that occur. It has been found from inspection
and experience with data that the coupling in some directions is weak or
can be accounted for in an after-the-fact manner. This weak coupling can
and should be exploited in setting up the mathematical models for parameter
identification.


The equation of motion in the x direction is completely decoupled
from the remainder of the equations if $F_x$ is independent of orientation.
Even if it does depend on orientation, Seiff and Wilkins (Ref. 3) have shown that
the dependence can be accounted for in an after-the-fact manner. This will
be considered in detail later. Next, a major decoupling is brought about
by transforming the remaining equations from time as the independent
variable to distance along the direction of flight x. When this has been
done, the equations of angular momentum depend only weakly on the equations
of linear momentum but the coupling in the opposite direction is strong.
The weak coupling can be accounted for in an iterative manner or neglected
entirely in many cases. Furthermore, within the set of angular momentum
equations there is some weak coupling that can be exploited. This latter
coupling depends on the set of angles that are used to describe the motion.

The aerodynamic forces and moments are unknown functions of angular
orientations and angular rates. This is the information that we are trying
to find. It is also the area where much modeling needs to be done.
At present most modeling consists of polynomial expressions in terms of
the angles and angular rates. Two approaches have been used here. The
classical approach (Ref. 4) uses conventional concepts of angle of attack and
angle of sideslip. The other uses a resultant angle approach (Ref. 5). The
latter includes nonlinear terms in a simpler and often more natural way.

PARAMETER IDENTIFICATION. We now take up the classical parameter identification
step, namely the bringing together of experimental data and a
mathematical model to deduce some aerodynamic parameters. In this step we
must fit our equations to the measured motions and determine the initial
conditions and aerodynamic parameters that give the "best fit". "Best fit"
here is meant in the least squares sense. Hence our starting point is the
sum of the squares of the differences between discrete points of an experimentally
determined function $f_{e_i}$ and the calculated function $f_{c_i}$. This is
written as

$$SSR = \sum_{i=1}^{N} \left(f_{e_i} - f_{c_i}\right)^2 \qquad (7)$$

where $i$ is the index of the measuring station and $N$ is the number of
these stations. We will now consider three cases of increasing complexity
of the function $f_c$.

Equation (9) cannot be used for $f_{c_i}$ in Equation (7) to give a normal
closed-form least squares solution because $C_D$ appears in a transcendental
manner. Hence we must consider some other alternative, and an iterative
application of the conventional procedure will suffice. To start this
we need a first approximation for $C_D$. This can be obtained by noting
from experience that $KC_D$ is normally small, resulting in the following
approximation for Equation (9):

$$t = t_0 + \cdots \qquad (10)$$

[remaining terms of Equation (10) illegible in the source]

Since this is a polynomial in x, the straightforward least squares
procedure readily leads to a value of $C_D$. This value is normally within
a few percent of the value obtained using Equation (9). To obtain the
value of $C_D$ consistent with Equation (9), this approximate value of $C_D$
is substituted into Equation (9) and the standard least squares procedure
is used to determine the leading constant and the multiplier on the
exponential (these appear in a linear manner). The value of SSR is then
calculated; $C_D$ is now changed by a small amount and the process repeated.
This entire procedure is repeated, always looking for the value of $C_D$
that yields the minimum value of SSR. This search can be carried out in
many ways. A simple and fast way is to repeat the process for three
values of $C_D$, fit a parabola to the values of $C_D$ and SSR, and determine a
minimum. This value of $C_D$ can be the starting place to repeat the process
with smaller changes in $C_D$, if more accuracy is required.
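
The search procedure just described can be sketched as follows. Equation (9) is not reproduced in this excerpt, so the sketch assumes a time-distance model of the form $t = a + b\,e^{kx}$, where $a$ is the leading constant, $b$ the multiplier on the exponential, and $k$ stands for the product $KC_D$. This form, the function names, and the synthetic data are all assumptions made for illustration. For each trial $k$ the two linear constants are found by ordinary least squares, and a parabola through three (k, SSR) points gives the next estimate.

```python
import math

def fit_linear_part(x, t, k):
    """For fixed k, least-squares fit of t = a + b*exp(k*x); returns (ssr, a, b)."""
    e = [math.exp(k * xi) for xi in x]
    e_mean, t_mean = sum(e) / len(e), sum(t) / len(t)
    b = (sum((ei - e_mean) * (ti - t_mean) for ei, ti in zip(e, t))
         / sum((ei - e_mean) ** 2 for ei in e))
    a = t_mean - b * e_mean
    ssr = sum((ti - a - b * ei) ** 2 for ei, ti in zip(e, t))
    return ssr, a, b

def parabola_vertex(ks, ssrs):
    """Abscissa of the extremum of the parabola through three (k, SSR) points."""
    (k1, k2, k3), (s1, s2, s3) = ks, ssrs
    num = (k2 - k1) ** 2 * (s2 - s3) - (k2 - k3) ** 2 * (s2 - s1)
    den = (k2 - k1) * (s2 - s3) - (k2 - k3) * (s2 - s1)
    return k2 - 0.5 * num / den

def search_rate(x, t, k_start, rel_step=0.05, passes=4):
    """Repeat the three-point parabola search with successively smaller steps."""
    k = k_start
    for _ in range(passes):
        ks = (k * (1 - rel_step), k, k * (1 + rel_step))
        ssrs = tuple(fit_linear_part(x, t, ki)[0] for ki in ks)
        k = parabola_vertex(ks, ssrs)
        rel_step /= 2
    return k

# Synthetic, noise-free station data generated from the assumed model.
k_true, v0 = 0.005, 1000.0
x = [2.0 * i for i in range(11)]
t = [(math.exp(k_true * xi) - 1.0) / (k_true * v0) for xi in x]
k_est = search_rate(x, t, k_start=1.05 * k_true)
```

The starting value here is deliberately set five percent high, in the spirit of the polynomial first approximation being "within a few percent". The drag coefficient would then follow as $C_D = k/K$ once $K$ is known.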


Variable  Drag  Coefficient 

In most cases of interest drag depends on the angular orientation. Seiff and Wilkins have shown that for axisymmetric bodies, when the drag coefficient depends on the resultant angle of attack, $\alpha$, as

$$ C_D = C_{D_o} + C_{D_2}\alpha^2 \tag{11} $$

the drag coefficient determined as though it were constant is the correct value at the root-mean-square angle of attack for that flight. If this is so, values of $C_D$ determined assuming constant $C_D$, when plotted as a function of the mean-squared angle of attack, should result in a straight line whose intercept is $C_{D_o}$ and whose slope is $C_{D_2}$. Two examples of such data are shown in Figure 4. Here we see that we do indeed obtain a straight line. Note also that in this case the intercept is in good agreement with theoretical values.
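The straight-line relation above can be recovered from a set of flights with an ordinary linear fit. A minimal sketch, assuming each flight supplies one mean-squared angle of attack and one constant-drag value of $C_D$; the function and argument names are hypothetical.

```python
import numpy as np

def fit_drag_line(alpha_sq_mean, cd_constant):
    """Fit C_D = C_D0 + C_D2 * (mean-squared angle of attack).
    Returns (C_D0, C_D2), i.e. the intercept and slope of the line."""
    slope, intercept = np.polyfit(alpha_sq_mean, cd_constant, 1)
    return float(intercept), float(slope)
```

With scattered data, the residuals about the fitted line give a quick check on whether the quadratic dependence of Equation (11) is adequate for the body in question.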


Data  Comparison 

To show that data obtained in ground facilities are applicable to a full-scale flight we have made a comparison of wind tunnel, ballistic range, and full-scale data. The full-scale flight was for a 55° blunted cone approximately one meter in diameter. Onboard accelerometer measurements were combined with meteorological measurements of density to determine the drag coefficients. The ballistic range data were obtained by Robert Sammonds and Robert Kruse of Ames Research Center. The wind tunnel results are compiled from several sources. A comparison of these data is shown in Figure 5. Note that $C_D$ is plotted versus Mach number, but since the full-scale flight experienced a particular Reynolds number and angle of attack at each Mach number, the wind tunnel and ballistic range results are for the appropriate Reynolds number and angle of attack. It can be seen that while the agreement is not perfect, it is within 2-4% for most of the range of conditions. This is well within the accuracy of the full-scale measurements.


Transcendental  or  Nonlinear  Case  -  Linear  Stability 

The  next  most  complicated  parameter  identification  problem  arises 
when  some  of  the  parameters  occur  in  a  transcendental  or  nonlinear 
manner.  An  important  example  of  this  in  ballistic  range  testing  is  the 
solution  of  the  equations  of  angular  momentum  when  linear  aerodynamics 
and  constant  roll  rate  have  been  assumed  for  an  axisymmetric  body  with 
small asymmetries. The equation of motion for that case is

$$ \xi'' + A\xi' + B\xi = Ce^{ipx} \tag{12} $$

where $\xi = \beta + i\alpha$, $\beta$ is the angle of sideslip, and $\alpha$ is the angle of attack. The solution to Equation (12) is

$$ \xi = K_1 e^{(\eta_1 + i\omega_1)x} + K_2 e^{(\eta_2 - i\omega_2)x} + K_3 e^{ipx} \tag{13} $$

which is called the tricyclic equation because it is equivalent to three rotating vectors in the $\alpha$-$\beta$ plane. This equation was first derived by Nicolaides (Reference 9). Here the $\eta$'s and $\omega$'s are related to the important aerodynamic parameters, damping and pitching moments, respectively. Equation
(13) can be written in terms of components as

tolerance. The starting solution for this case is obtained using the Prony method. It will be discussed briefly in a later section involving starting solutions. Some examples of fits to experimental $\alpha$, $\beta$ data using Equation (13). Some values of the aerodynamic pitching moment curve slope, $C_{m_\alpha}$, obtained from similar fits, are shown in Figure 7. Note $C_{m_\alpha}$ is related to $\omega_1$ and $\omega_2$ as

(17)

We see in Figure 7 that $C_{m_\alpha}$ appears to be nearly constant (i.e., $C_m$ is linear) for small angles of attack and hence the use of Equation (13) is justified but, as the angles become large, $C_{m_\alpha}$ starts to decrease and our assumption of linear aerodynamics is no longer valid. This can be handled in a quasilinear manner, as developed by Murphy (Reference 13) and Rasmussen and Kirk (Reference 14). These methods relate the values of $C_{m_\alpha}$ determined from a linear analysis using Equation (13) for several flights at different amplitudes to the best nonlinear polynomial representations of pitching moment $C_m$. When the procedure of Reference 14 is applied to the data for Mach number 11.5 of Figure 7, we get the results shown in Figure 8.


The  method  of  quasilinear  analysis  does  not  handle  nonlinear  damping 
very  well  and  entails  some  other  approximations  that  prevent  it  from 
having  complete  generality.  Hence,  we  are  led  to  the  final  and  most 
general  case  of  parameter  identification. 


Differential  Equation  Case  -  Nonlinear  Stability 

The  most  general  case  of  parameter  identification  occurs  when  the 
mathematical  model  is  of  sufficient  complexity  to  prohibit  the  possibility 
of  finding  a  closed-form  solution.  To  illustrate  how  a  solution  for  this 
problem  proceeds  we  will  consider  the  planar  motion  of  a  vehicle  governed 
by  nonlinear  static  and  dynamic  aerodynamic  moments.  The  equation  for 
this  case  is 

$$ \ddot{\alpha} + (C_1 + C_2\alpha^2)\dot{\alpha} + (C_3 + C_4\alpha^2)\alpha = 0 \tag{18} $$

with the initial conditions

$$ \alpha(0) = C_5, \qquad \dot{\alpha}(0) = C_6. $$

To apply least squares to this we first use the method of differential corrections (Equation (16)). Note that the fitted function is replaced by $\alpha$ and there are only six unknowns to consider. What is needed is an initial solution $\alpha_o$; if six approximate values of the C's are known, it is obtained by straightforward integration (a Runge-Kutta integration procedure works well). To obtain the derivatives of $\alpha_o$ with respect to the C's we apply parametric differentiation (Reference 15).

PARAMETRIC DIFFERENTIATION. Parametric differentiation starts by differentiation of the equation of interest (Equation (18)) with respect to each of the parameters to be determined. For example, for $C_3$, defining $G_3 = \partial\alpha/\partial C_3$ we obtain

$$ \ddot{G}_3 + (C_1 + C_2\alpha^2)\dot{G}_3 + 2C_2\alpha\dot{\alpha}G_3 + (C_3 + 3C_4\alpha^2)G_3 = -\alpha \tag{19} $$

with initial conditions

$$ G_3(0) = 0, \qquad \dot{G}_3(0) = 0. $$

There are six of these equations. Using appropriate values for the C's we must integrate simultaneously the six equations for the $G_i$'s and the one for $\alpha$ with appropriate initial conditions. We now have all of the information to proceed with our least squares by differential correction procedure.
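The simultaneous integration can be sketched as below. This is an illustration, not the authors' program: it assumes a fixed-step fourth-order Runge-Kutta scheme, a state vector ordered as (α, α̇, G1..G6, Ġ1..Ġ6), and forcing terms for the other five sensitivity equations obtained by differentiating Equation (18) in the same way as for $C_3$. The initial conditions for the G's follow from differentiating α(0) = C5 and α̇(0) = C6.

```python
import numpy as np

def rhs(y, C):
    """Right-hand side for alpha and the six sensitivities G_i = d(alpha)/dC_i."""
    c1, c2, c3, c4 = C[:4]
    a, ad = y[0], y[1]
    G, Gd = y[2:8], y[8:14]
    damp = c1 + c2 * a * a
    add = -damp * ad - (c3 + c4 * a * a) * a              # Equation (18)
    # Forcing terms from differentiating (18) with respect to C1..C6
    forcing = np.array([-ad, -a * a * ad, -a, -a ** 3, 0.0, 0.0])
    stiff = 2.0 * c2 * a * ad + c3 + 3.0 * c4 * a * a
    Gdd = forcing - damp * Gd - stiff * G                 # Equation (19) and its analogues
    return np.concatenate(([ad, add], Gd, Gdd))

def integrate(C, t_end, n):
    """Fixed-step RK4 integration; row k holds the state after k steps."""
    y = np.zeros(14)
    y[0], y[1] = C[4], C[5]   # alpha(0) = C5, alpha_dot(0) = C6
    y[2 + 4] = 1.0            # d alpha(0) / dC5 = 1
    y[8 + 5] = 1.0            # d alpha_dot(0) / dC6 = 1
    h = t_end / n
    out = [y.copy()]
    for _ in range(n):
        k1 = rhs(y, C)
        k2 = rhs(y + 0.5 * h * k1, C)
        k3 = rhs(y + 0.5 * h * k2, C)
        k4 = rhs(y + h * k3, C)
        y = y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append(y.copy())
    return np.array(out)
```

Columns 2 through 7 of the result are the partial derivatives needed to build the differential correction equations at each data point.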

Before proceeding with some results we will discuss an important factor that affects how the above procedure is applied.

INFORMATION CONTENT. In any single test in a ballistic range where the aerodynamics may be nonlinear there is normally not enough data (information) to determine, with any degree of confidence, the individual nonlinear terms. This will be illustrated with the wave form of a planar oscillation. In Figure 9 we have plotted the wave form of the pitching motion for two cases having the same wave length and with the amplitude normalized out. The upper curve is for a linear aerodynamic moment and is a sine wave. The other curve is for a pure cubic pitching moment. Note that there is not much difference between the two, particularly when only a few discrete data points on the curve are considered and these contain experimental error. All two-term linear-cubic pitching moments where both terms are stabilizing produce wave forms that fall between these two. Hence we see that the only important way of detecting nonlinearity when the pitching moment is nonlinear is that the wave length is a function of amplitude.

Fig. 9. Effect of nonlinear pitching moment on waveform of oscillation

In many ballistic range tests, however, a model may be only lightly damped (small change in amplitude) and/or only a few wave lengths of motion are observed. Therefore the only way to discover the nonlinearity is to reduce simultaneously data from several flights with different amplitudes. In doing this, the aerodynamics (coefficients $C_1$ - $C_4$) are assumed to be constant from flight to flight but the initial conditions ($C_5$ and $C_6$) are different. Hence, for example, in fitting four tests simultaneously, we must solve for twelve unknowns: four aerodynamic parameters and eight initial conditions. The fit of Equation (18) to four ballistic range tests of a Gemini capsule is shown in Figure 10. The static pitching moment curves deduced from these flights are shown in Figure 11. Shown for comparison is the result deduced using the quasilinear approach. The slight difference can possibly be attributed to the inclusion of a nonlinear damping. Also included to show the sensitivity is a curve generated with only three runs; one of the small angle runs was deleted.


Comparison of $C_{m_\alpha}$

To complete the discussion of stability we will again make a comparison of wind tunnel, ballistic range, and full-scale flight. The data are from the same sources as the drag comparison (Figure 5). The comparison shown in Figure 12 is seen to be very good for most of the Mach number range. Again we see that the data from the small-scale tests can be used with confidence in a full-scale test.

Additional  Cases 

Two additional cases that have employed parametric differentiation will be considered briefly to illustrate the versatility of the method.

The first involves the data reduction for a ballistic range test of the X-15 airplane. In this case only linear aerodynamics have been used but because the body is not axially symmetric, there are many more unknown aerodynamic parameters. A curve fit of the $\alpha$ and $\beta$ data using Equations 7.97 and 7.98 of Reference 1 is shown in Figure 13. Note the odd behavior that is exhibited by the X-15 model and yet the curve fit is very good. Aerodynamic data obtained from the test agree well with flight test results. When this particular test was made in the late 1950's, it could be reduced with more approximate techniques.

Static stability comparison.

A  second  example  is  that  of  an  axisymmetric  body  trimmed  near  90° 
angle  of  attack.  For  this  case  the  equation  for  the  resultant  angle 
assuming  no  swerve  (Equation  &. 46  of  Reference  1)  was  used  with  a  non¬ 
linear  pitching  moment  and  a  constant  damping  parameter.  A  curve  fit 
of  the  data  from  this  test  is  shown  in  Figure  14.  Here  again  we  see  a 
good  fit  to  the  data.  An  important  point  to  be  made  here  is  that  the 
choice  of  equation  for  the  fitting  procedure  can  simplify  the  problem 
greatly. 


Associated  Topics 

There  are  several  topics  associated  with  parameter  identification 
that have been passed over or only touched on briefly. We will now consider five of these: starting solutions, convergence of the iteration procedure, modeling of forces and moments, experimental errors, and sensitivity of results and experiment design.

STARTING SOLUTIONS. The problem of obtaining a good starting solution for the differential correction procedure can be very important to obtaining the correct converged solution (i.e., there may be multiple minima in a nonlinear multiparameter problem). There are probably as many ways to get starting solutions as there are problems. We will briefly mention a few that have been found to be useful in ballistic work. The first approach would be to look for a linear solution. This was done in the case of drag for small values of $KC_D$. The next case arises when the solution is written as a sum of exponentials as in the linear stability case. Here the Prony procedure can be applied. The basic approach here is that for equally spaced data points a recursion formula can be written relating each data point to prior data points. This procedure leads to a linear set of equations that is readily solved. If the data are not equally spaced, as is the normal case, a set of equally spaced data can be constructed from the original set either by hand or machine-fairing of the data.
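The Prony procedure described above can be sketched for complex, equally spaced samples. This is a minimal textbook form, not necessarily the variant used by the authors; the function name and argument names are hypothetical, and the data are assumed to be sampled finely enough that each frequency times the spacing is below π.

```python
import numpy as np

def prony(y, p, dx):
    """Fit y[m] = sum_k K_k * exp(s_k * m * dx) with p exponentials.
    Returns the complex exponents s_k and amplitudes K_k."""
    y = np.asarray(y, dtype=complex)
    n = len(y)
    # Recursion: y[m] = -(a_1*y[m-1] + ... + a_p*y[m-p]) for m = p..n-1
    H = np.column_stack([y[p - j:n - j] for j in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(H, -y[p:], rcond=None)
    # Roots of z^p + a_1*z^(p-1) + ... + a_p give z_k = exp(s_k * dx)
    z = np.roots(np.concatenate(([1.0], a)))
    s = np.log(z) / dx
    # With the exponents known, the amplitudes enter linearly
    V = np.vander(z, n, increasing=True).T
    K, *_ = np.linalg.lstsq(V, y, rcond=None)
    return s, K
```

For the tricyclic form of Equation (13), p would be 3 and the real and imaginary parts of each exponent supply starting values for the damping and frequency terms.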

For cases where parametric differentiation is employed one can use the quasilinear analysis of Reference 13 or Reference 14 to determine starting values of the static stability parameters, or one can formally integrate the equations and obtain implicit integral equations for the angular motion. If the integrals in these equations are determined by fairing the data and performing the indicated integration graphically, a set of linear equations can be constructed to evaluate the unknown parameters. This method was originally used in Reference 17 and is described in Reference 16.


Other  sources  of  starting  solutions  are  existing  data,  theoretical 
determination  of  aerodynamic  parameters,  and  finally,  probably  the 
most  important,  the  experience  of  the  analyst. 

CONVERGENCE AND STABILITY. In the iterative procedures discussed one normally has to prescribe some realistic criterion by which to judge convergence. One can either require that the change in the sum of the squares of the residuals is less than some amount or that the changes in the parameters being sought are smaller than some prescribed values. The latter is probably a better criterion but the former requires fewer convergence tests per iteration and is normally used. In either case one has to be careful that the convergence criteria are not so strict as to be within the calculation error bound (round off, truncation, etc.).


In some cases the iterative procedure is unstable. This instability normally arises because some corrections obtained in the differential correction procedure are so large as to violate the two-term Taylor series expansion. An obvious way to prevent this from happening is to retard the corrections in a systematic way when necessary. This can be accomplished with what is referred to here as the "Marquardt Algorithm." It starts with the following alternative equation for the sum of the squares of the residuals.

(20)

where R is the number of unknowns and $\lambda$ is an arbitrary positive constant, usually small. The first term in brackets is the conventional least squares by differential correction term. When SSR is minimized with just this term, the $\Delta a_j$'s are adjusted to obtain the minimum difference between experiment and calculation. The additional term represents the change from the initial solution to the new calculated solution. If we add a small amount of this term (determined by adjusting $\lambda$) to be minimized at the same time as the first term, we in effect slow down the rate at which the $\Delta a_j$'s change and hence keep our solution within the range of validity of the two-term Taylor series expansion. The value of $\lambda$ required for most cases is on the order of $10^{-3}$. However, values as large as order 10 have sometimes been required. It is important to note that the choice of $\lambda$ will have no effect on the converged solution since it affects only intermediate steps (i.e., the path and rate of convergence are altered but not the end point). This is not strictly true if the new path would by chance take the solution to the neighborhood of some other nearby minimum.

The resulting matrix equation for the least squares solution of Equation (20) can be written as

$$ A\,\Delta a = r $$

$$ \Delta a = A^{-1} r $$

where $A_{jj} = (1 + \lambda)A'_{jj}$ and $A_{jk} = A'_{jk}$, $j \neq k$. Hence the $A'_{jj}$ and the $A'_{jk}$ are the matrix elements when only the first term of Equation (20) is retained (conventional least squares with differential correction). Note the simple way the "Marquardt Algorithm" is added to the problem; the diagonal elements are multiplied by $(1 + \lambda)$. Additional coding is required to determine when to use $\lambda$ and how large it should be. One procedure is to use it only when SSR increases and then start with a small value of $\lambda$; if this fails, square $(1 + \lambda)$; if this fails again, cube $(1 + \lambda)$, and so on. When convergence is reestablished, return $\lambda$ to 0 and proceed until it is required again.
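The diagonal scaling can be sketched on a small problem. The example below is a hypothetical illustration, not the authors' code: it fits a two-parameter decaying exponential, takes an unretarded step first, and raises $(1 + \lambda)$ to successive powers only when SSR increases. The starting value `lam0` and the retry limit are assumptions.

```python
import numpy as np

def model(x, p):
    a, b = p
    return a * np.exp(-b * x)

def jacobian(x, p):
    a, b = p
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

def fit(x, y, p0, lam0=1e-3, tol=1e-12, max_iter=100, max_retry=50):
    """Least squares by differential correction with retarded steps."""
    p = np.asarray(p0, dtype=float)
    ssr = np.sum((y - model(x, p)) ** 2)
    for _ in range(max_iter):
        J = jacobian(x, p)
        A0 = J.T @ J                      # A' of the conventional method
        r = J.T @ (y - model(x, p))
        for k in range(max_retry + 1):    # k = 0 is the unretarded step
            A = A0.copy()
            A[np.diag_indices_from(A)] *= (1.0 + lam0) ** k
            dp = np.linalg.solve(A, r)
            new_ssr = np.sum((y - model(x, p + dp)) ** 2)
            if new_ssr <= ssr:
                break                     # SSR did not increase; accept the step
        p = p + dp
        converged = abs(ssr - new_ssr) <= tol
        ssr = new_ssr
        if converged:
            break
    return p, ssr
```

Because the retardation is dropped at the start of every iteration, the converged parameters are those of the conventional procedure whenever the unretarded step is acceptable near the minimum.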


MODELING FORCES AND MOMENTS. It was noted earlier that modeling of forces and moments was an important area of work to be done and that normally polynomial representations were used. The parametric differentiation approach to the problem is much more general; for example, the moment could be composed of a series of piecewise linear segments. The possibilities are many and I will not pursue the point further. What I will illustrate briefly is the sensitivity to the form of the expression. This will be illustrated by a set of results that was obtained using the quasilinear method. (Similar results would be expected to apply to the parametric differentiation approach.) In that study Kirk and Miller used a four-term polynomial; the first term was always the linear term in angle of attack, and the remaining three terms were various combinations of powers from 2 to 7. This resulted in 20 possible polynomials. The best five polynomials are shown in Figure 15. All other polynomials produced fits that had considerably larger residuals. It is not really necessary to choose a best representation from these five polynomials as they give nearly identical results within the amplitude range of the experiment.
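The count of 20 candidate polynomials follows from choosing three powers out of the six available (2 through 7). A short sketch that enumerates them; the function name is hypothetical, and the tuples use the same power labelling as the figure legend (for example 1-3-4-5).

```python
from itertools import combinations

def candidate_polynomials(low=2, high=7, extra_terms=3):
    """All four-term power sets: the linear term plus three higher powers."""
    return [(1,) + c for c in combinations(range(low, high + 1), extra_terms)]
```

Each tuple can then be passed to the fitting routine in turn and the residuals compared.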

EXPERIMENTAL ERROR. Errors always exist in experimental data. Most of them are random, but they often may be correlated because of some calibration error or facility problem. The latter can and should be eliminated, and the magnitudes of the random ones known, at least insofar as standard deviations are concerned. A detailed evaluation of the errors is possible on a continued basis when data are being reduced in large quantities. This is accomplished by saving the residuals from each curve fit and periodically examining them. The residuals obtained from a series of tests in the Ames Hypervelocity Free-Flight Aerodynamic Facility are shown plotted in Figure 16. The number of times an error fell within ±0.00013 cm of a value of $\Delta z$ is indicated by the number of dots plotted at that level. This is done for each station. Curves are faired over each of these distributions. We can see that all of the stations appear to have similar distributions but few are centered at zero. This bias is thought to be associated with the facility. The test section is made in sections, each composed of four or five windows. If each section were twisted relative to the others, a slightly different optical distortion would occur in each window. These nonzero means can be taken out of the experimental data by a calibration constant. A periodic facility calibration can be obtained if a heavy sphere is launched at very low pressure in the facility; its trajectory, which is a straight line except for gravity, can be used to calibrate the facility. Continued collection of residual information from ongoing tests can be used to check for reading errors, calibration changes, and modeling errors.

Fig. 15. Sensitivity of pitching moment modeling. Envelope for 1-3-4-5, 1-3-4-6, 1-3-4-7, 1-3-5-6, and 1-3-5-7.

SENSITIVITY AND EXPERIMENT DESIGN. The last point to be considered is the sensitivity of deduced results to experimental setup and error. This could be done using a Monte Carlo approach, by perturbing all the data points in a random manner, analyzing the results, and repeating this process many times to see what the statistical effect on the results is. This process is time-consuming and not really required. When the differential correction procedure is used, the inverse of the A matrix represents the variances and covariances of the parameters being determined; hence all of the information that is required to determine the sensitivity of the parameters is present when the solution is obtained.
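A minimal sketch of reading parameter uncertainties from the inverse of the A matrix. It assumes independent measurement errors with a common standard deviation `sigma`, in which case the covariance is sigma squared times the inverse of JᵀJ, where J holds the partial derivatives of the fitted function with respect to the parameters at each data point. The names are hypothetical.

```python
import numpy as np

def parameter_covariance(J, sigma):
    """Covariance of the fitted parameters from the normal matrix A = J^T J."""
    A = J.T @ J
    return sigma ** 2 * np.linalg.inv(A)

def parameter_std_errors(J, sigma):
    """Standard deviations of the parameters (square roots of the diagonal)."""
    return np.sqrt(np.diag(parameter_covariance(J, sigma)))
```

For a model with a single constant parameter and N points, this reduces to the familiar sigma squared over N for the variance of the mean.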

The above procedure can also be looked at in a different manner; namely, what combination of test procedures will minimize the error in a particular parameter. This was done for a simple damped sine wave with some further simplification to allow a closed-form expression to be obtained for the expected error in the parameter of interest. The results of this are plotted in Figure 17 for the damping parameter. The expected error in the damping parameter is plotted versus number of cycles of motion observed, N, for various numbers of data points per cycle, n. Also shown are some data obtained by a Monte Carlo approach. From this we see that, if the number of stations of data is fixed, we are better off with many cycles of motion and only a few points per cycle. Thus we see that we can also use our data reduction procedure to help design optimum experiments.

CONCLUSION. In the foregoing material we have covered in a rather comprehensive manner most aspects of parameter identification as it applies to ballistic range data reduction. There are many areas where much work remains to be done but the procedures are sufficiently well-defined to allow aerodynamic results to be obtained with a high degree of confidence in their validity.

THE WSMR BEST ESTIMATE OF TRAJECTORY - AN OVERVIEW

William S. Agee & Robert H. Turner
Analysis & Computation Division
White Sands Missile Range, New Mexico


INTRODUCTION 

The Special Projects Section of the Analysis and Computation Division at WSMR has developed a Best Estimate of Trajectory (BET) program for use in post-flight data reduction. The first question that might be asked is why did we develop a BET program? One reason is because of requests from Range Users (we are currently using the BET for LANCE, SAM-D, and the forthcoming 621B Navigational Satellite Tests). However, besides these requests, BET has several advantages over the conventional single instrumentation system data reduction programs currently in use. In order to see these advantages it is easiest to first review the conventional post-flight reduction procedure and note its deficiencies.

The primary instrumentation systems at WSMR are radar, cinetheodolite or fixed camera, and dovap. In the traditional method of post-flight reduction, independent estimates of trajectory parameters (Cartesian components of position, velocity, and acceleration) are obtained for each of the primary systems observing the trajectory. In some cases all three of the primary systems will be tracking so that there would be three independent sets of position, velocity, and acceleration. These independent estimates are bound to disagree. This disagreement presents a difficult problem for the trajectory analyst. Another difficulty is that each of the primary systems provides measurements only for a portion of the trajectory. For example, a radar often will not provide valid tracking data during the boost phase of a missile, the optical measuring systems run out of film, and dovap provides unreliable data at low altitudes. Thus, none of the independent trajectory estimates covers the entire trajectory. In addition, requirements on accuracy and precision sometimes cannot be met by reducing data from a single instrumentation system. Instead of developing and procuring new instrumentation to meet these requirements it may be possible, and certainly more economical, to satisfy the requirements by extending the capability of existing instrumentation by combining data from the various instrumentation systems. Telemetered or in-flight recorded data can also be included in the BET solution to extend the capability of existing instrumentation.

The remainder of this paper has been reproduced photographically from the author's manuscript.

In summary, we have pointed out three serious deficiencies of the conventional single instrumentation system type of data reduction procedure.

1.  No  single  set  of  trajectory  parameter  estimates  is  obtained. 

2. None of the independent sets of trajectory parameter estimates covers the entire trajectory.

3. The inefficient use of measuring resources.

The removal of these deficiencies is the basic advantage we hope to gain by use of a BET program. In addition to the above advantages the BET will provide feedback to the instrumentation people in the form of a plotted time history of instrumentation system performance. Time histories of measurement bias, variance, and residual are a natural output of the BET program. These are plotted for each measurement participating in the trajectory solution.

TRAJECTORY  MODELLING 

As stated previously, one of our primary reasons for developing a BET at WSMR is to provide a single estimate of position, velocity, and acceleration through the combination of all available range measurements. Any technique developed for this application must apply to most of the flight test programs at WSMR for which there are data reduction requirements. Obviously, this requires the use of a rather general dynamic model of the flight test trajectory.

One model which meets this requirement is the second order polynomial model presently used in the data reduction process. In state variable form this model is

This very simple model has been used with some success in situations where the dynamics of the process are not too severe, such as aircraft tracking and free flight missile trajectories.
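A second order polynomial (constant acceleration) model can be propagated with a simple transition matrix. The sketch below is an assumption about the form of such a model, not a transcription of the authors' equations: it uses nine states ordered as three positions, three velocities, and three accelerations, with the accelerations held constant over the step.

```python
import numpy as np

def transition_matrix(dt):
    """9x9 transition matrix for [position, velocity, acceleration]
    under the constant-acceleration assumption (assumed state ordering)."""
    I = np.eye(3)
    Z = np.zeros((3, 3))
    return np.block([
        [I, dt * I, 0.5 * dt ** 2 * I],
        [Z, I, dt * I],
        [Z, Z, I],
    ])

def propagate(state, dt):
    """Advance the nine-element state by dt seconds."""
    return transition_matrix(dt) @ np.asarray(state, dtype=float)
```

Each position component then follows a quadratic in time between updates, which is why the model suits trajectories whose accelerations change slowly.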

We have had much more success with a dynamic model which, rather than x, y, and z as states, uses acceleration components close to "where the action is," namely tangent and normal to the trajectory. Let $A_T$ (tangential acceleration) be tangent to the trajectory, $A_N$ (normal acceleration) be normal to $A_T$ and lie in the vertical plane, and $A_L$ be normal to the $A_T$ and $A_N$ directions and complete a right-handed system.

We assume that these acceleration components do not contain the effect of gravity. Using these accelerations we define the following dynamic model.

where

$$ V_q = (x_4^2 + x_5^2)^{1/2} $$

$$ V = (x_4^2 + x_5^2 + x_6^2)^{1/2} $$

This is the model which we presently use in the BET program. Note that we are still using a constant acceleration assumption as in the quadratic model but that the present model is nonlinear. One reason that this model is considerably more useful is that acceleration measurements, which are made aboard the test vehicle, are usually much easier to model in terms of tangential, normal, and lateral accelerations.

MEASUREMENT MODELLING

Besides modelling the trajectory, the measurements must also be modelled in terms of the trajectory state variables. Thus, for each measurement we must specify a nonlinear measurement function h(x) which relates the ideal measurement to the trajectory state. We assume that the position and velocity state variables ($x_1$, $x_2$, $x_3$, $x_4$, $x_5$, $x_6$) of the trajectory are with respect to a coordinate system which we will call the launch system.

RADAR  MEASUREMENTS 

Radar observations are usually from the FPS-16 instrumentation radars. These radars measure range, azimuth, and elevation of a target in a local radar Cartesian coordinate system. Some of the radars also measure the range rate of the target. The observed range, azimuth, and elevation (RAE) are first corrected for calibration and refraction. Direction cosines are computed from these corrected observations and then related to the launch coordinate system where azimuth and elevation angles are recomputed. In terms of the trajectory state variables, which are in the launch coordinate system, the radar measurement functions are:

RANGE

$$ h_1(x) = [(x_1 - x_I)^2 + (x_2 - y_I)^2 + (x_3 - z_I)^2]^{1/2} $$

AZIMUTH

$$ h_2(x) = \tan^{-1}[\,\cdots\,] $$

ELEVATION

$$ h_3(x) = \tan^{-1}\left[\frac{x_3 - z_I}{[(x_1 - x_I)^2 + (x_2 - y_I)^2]^{1/2}}\right] $$

where ($x_I$, $y_I$, $z_I$) are the coordinates of the radar in the launch coordinate system.

RANGE RATE

$$ h_4(x) = \frac{x_4(x_1 - x_I) + x_5(x_2 - y_I) + x_6(x_3 - z_I)}{h_1(x)} $$


OPTICAL  MEASUREMENTS 


The fixed cameras and tracking cameras measure azimuth and elevation of the line-of-sight to the target. The observed angles are corrected for calibrations and refractions. The measurement functions for the cameras are the same as for the radar azimuth and elevation measurements.

$$ h_2(x) = \tan^{-1}[\,\cdots\,] $$

$$ h_3(x) = \tan^{-1}\left[\frac{x_3 - z_I}{[(x_1 - x_I)^2 + (x_2 - y_I)^2]^{1/2}}\right] $$

where ($x_I$, $y_I$, $z_I$) are the coordinates of the camera station in the launch coordinate system.
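The range, elevation, and range rate measurement functions translate directly into code. A minimal sketch, assuming the state is ordered as three positions followed by three velocities in the launch system and that `site` holds the station coordinates; the function names are hypothetical. Azimuth is left out because its reference axis is not specified here.

```python
import numpy as np

def slant_range(x, site):
    """Range: distance from the station to the target."""
    d = np.asarray(x[:3], dtype=float) - np.asarray(site, dtype=float)
    return float(np.linalg.norm(d))

def elevation(x, site):
    """Elevation: angle of the line of sight above the horizontal plane."""
    d = np.asarray(x[:3], dtype=float) - np.asarray(site, dtype=float)
    return float(np.arctan2(d[2], np.hypot(d[0], d[1])))

def range_rate(x, site):
    """Range rate: velocity component along the line of sight."""
    d = np.asarray(x[:3], dtype=float) - np.asarray(site, dtype=float)
    v = np.asarray(x[3:6], dtype=float)
    return float(v @ d / np.linalg.norm(d))
```

Using `arctan2` rather than a plain arctangent of the ratio avoids a division by zero when the target is directly overhead.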

DOVAP  MEASUREMENTS 

The dovap measuring system is a two-way doppler system. The basic digitized measurement is the doppler cycle count over the sampling interval ($t_1$, $t_2$), which when properly scaled yields the change in loop range from transmitter to target to receiver. If in addition the measurement is divided by ($t_2 - t_1$), the result is the average loop range rate over the interval, which approximates the instantaneous loop range rate at ($t_1 + t_2$)/2. Following this procedure we represent the dovap observation by the measurement function

$$ h_5(x) = \frac{x_4(x_1 - x_T) + x_5(x_2 - y_T) + x_6(x_3 - z_T)}{R_T} + \frac{x_4(x_1 - x_R) + x_5(x_2 - y_R) + x_6(x_3 - z_R)}{R_R} $$

where ($x_T$, $y_T$, $z_T$) and ($x_R$, $y_R$, $z_R$) are the coordinates of the dovap transmitter and receiver. The quantities $R_T$ and $R_R$ are

$$ R_T = [(x_1 - x_T)^2 + (x_2 - y_T)^2 + (x_3 - z_T)^2]^{1/2} $$

$$ R_R = [(x_1 - x_R)^2 + (x_2 - y_R)^2 + (x_3 - z_R)^2]^{1/2} $$

VELOCIMETER  MEASUREMENTS 

The velocimeter is much like the dovap except that it is a one-way
doppler system. Again the cycle count is scaled and divided by the
sampling interval, (t_2 - t_1), the result being interpreted as the
instantaneous range rate at (t_1 + t_2)/2. The resulting measurement function is

    h_6(x) = [x_4(x_1 - x_I) + x_5(x_2 - y_I) + x_6(x_3 - z_I)]
             / [(x_1 - x_I)^2 + (x_2 - y_I)^2 + (x_3 - z_I)^2]^{1/2}
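The doppler measurement functions reduce to projecting the velocity onto a line of sight. The sketch below is illustrative only: the function names are hypothetical, the state is assumed to hold position in components 1 to 3 and velocity in components 4 to 6, and the dovap loop range rate is taken to be the sum of the range rates to the transmitter and the receiver, which is what differentiating the loop range gives.

```python
import numpy as np

def range_rate(x, station):
    """One-way range rate: velocity projected on the station-to-target line."""
    x = np.asarray(x, dtype=float)
    d = x[:3] - np.asarray(station, dtype=float)
    return float(x[3:6] @ d / np.linalg.norm(d))

def dovap_loop_range_rate(x, transmitter, receiver):
    """Two-way loop range rate: transmitter leg plus receiver leg."""
    return range_rate(x, transmitter) + range_rate(x, receiver)
```

With a single station the first function corresponds to the radar range rate and the velocimeter measurement; the second corresponds to the dovap measurement.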

ACCELERATION  MEASUREMENTS 

Several types of acceleration measurements are possible. Often only
the longitudinal body acceleration of a missile is measured and telemetered.
If the missile is assumed to have zero angle of attack, which is
often a good assumption, the measurement function is equal to the tangential
acceleration A_T, which is the seventh component of the state vector.
Thus

    h_7(x) = x_7    (zero angle of attack)

Sometimes  three  orthogonal  components  of  missile  body  accelerations  are 
measured  and  telemetered  as  below 


If  the  missile  is  assumed  to  have  zero  angle  of  attack  and  zero  roll 
angle,  an  assumption  which  must  be  given  careful  consideration  for  each 
individual  case,  the  measurement  functions  are 


    h_7(x) = x_7

    h_8(x) = x_8

    h_9(x) = x_9

Another important class of acceleration measurements comes from inertial
measurement units (IMU). This is the type of measurement made on the 621B
Navigational Satellite Tests. There are many configurations for IMU
measurements. Some IMU's make acceleration measurements in a coordinate
system slaved to the local vertical, some in an inertial coordinate system
set up at a launch point, etc. In addition to acceleration measurements,
attitude measurements of the body with respect to the reference
coordinate system are also available.

One  simple  type  of  IMU  measurements  which  we  have  processed  with 
the  BET  program  came  from  a  purely  inertial  system.  In  this  case  the 
inertial  system  was  aligned  with  the  aircraft  at  a  specified  time.  The 
future  accelerations  were  then  measured  in  this  coordinate  system.  This 
measurement  configuration  is  shown  below 


Although  inertial  platform  misalignments  and  drifts,  accelerometer 
scale  factor  errors,  and  accelerometer  zero  set  errors  must  be  modeled 
and  estimated,  we  assume  these  to  be  zero  for  the  present  discussion. 

In  absence  of  these  errors  the  acceleration  measurements  may  be  modelled 
in  terms  of  the  trajectory  accelerations  as 
where M_{Ie} is the rotation matrix from the earth-fixed launch coordinate
system to the inertial system and M_t is the velocity-dependent rotation
matrix  from  the  trajectory  coordinate  system  to  the  launch  coordinate 
system.  In  any  case  the  scalar  acceleration  measurements  are  linear 
functions  of  the  trajectory  accelerations 


    h_{10}(x) = M^T(x) [x_7, x_8, x_9]^T

where the vector M = M(x) is state dependent.

EXTENDED  KALMAN  FILTER 

Our BET program is basically an extended Kalman filter. For systems
whose dynamics and measurement functions are linear and whose uncertainties
all have Gaussian statistics, the Kalman filter is known to provide the
optimal recursive estimate of the state. For nonlinear systems the extended
Kalman filter, obtained by linearizing the nonlinear functions about the
current estimate of the state, has become a popular and highly useful
estimation procedure.

For our extended Kalman filter we assume the dynamic trajectory
model

    \dot{x} = f(x) + w

where  f(x)  was  previously  given  and  w  is  a  white  noise  term  with  zero 
mean  and  covariance  Q.  The  presence  of  the  state  noise  w  or  rather  its 
covariance  is  used  to  compensate  the  filter  gains  for  the  errors  made 
in  modelling,  in  particular  for  the  errors  in  the  trajectory  model  due 
to  the  constant  acceleration  assumption. 


Observations z(K) are available at discrete instants of time t_K. We
assume that the z(K)'s are statistically independent scalar observations.
The processing of scalar observations provides a numerically efficient
as well as intuitively appealing method of processing the observations.
The assumption of statistical independence of the observations can be
removed if necessary. The scalar observations are represented as

    z(K) = h(x(K)) + v(K)

where v(K) is a measurement noise term assumed to have zero mean and
variance r^2(K).
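The numerical appeal of scalar processing is that the innovation variance is a scalar, so no matrix inversion is needed. The sketch below shows one scalar update in ordinary covariance form; it is an illustration under stated assumptions, not the square root formulation used in the BET program, and the names `scalar_update`, `h`, and `grad_h` are hypothetical.

```python
import numpy as np

def scalar_update(x, P, z, h, grad_h, r2):
    """Process one scalar observation z = h(x) + v with var(v) = r2.

    h(x) returns the predicted measurement; grad_h(x) returns the gradient
    of h at x as a 1-D array (the linearization used by the extended filter).
    """
    H = np.asarray(grad_h(x), dtype=float)
    PHt = P @ H
    a2 = float(H @ PHt) + r2          # scalar innovation variance
    w = PHt / a2                      # gain vector, obtained by a scalar division
    x_new = x + w * (z - h(x))
    P_new = P - np.outer(w, PHt)
    return x_new, P_new
```

Several observations at the same time are handled by calling the function once per observation, feeding each result into the next call.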

Let x*(K|K-1) denote the filtered estimate at time t_K after processing
all observations through t_{K-1}, and x*(K) the filtered estimate at t_K after
processing all observations through t_K. Assuming that the state estimate
x*(K-1) has been computed, the predicted state estimate for the next
measurement time t_K is computed by numerically integrating the trajectory
model \dot{x} = f(x) using a second order Taylor series integration procedure

    x*(K|K-1) = x*(K-1) + f(x*(K-1)) \Delta t_K + J(x*(K-1)) f(x*(K-1)) (\Delta t_K)^2 / 2

where

    \Delta t_K = t_K - t_{K-1}

and

    J(x*(K-1)) = [\partial f / \partial x]_{x = x*(K-1)}    (9 x 9)
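A second order Taylor step needs only the model f and its Jacobian J, since the second time derivative of the state is J f when the noise is ignored. The sketch below illustrates the prediction on a hypothetical three-state example; the function name and the callables `f` and `jac` are assumptions, not part of the BET program.

```python
import numpy as np

def predict_state(x, f, jac, dt):
    """Second order Taylor prediction: x + f*dt + J*f*dt^2/2."""
    fx = np.asarray(f(x), dtype=float)
    J = np.asarray(jac(x), dtype=float)
    return x + fx * dt + 0.5 * (J @ fx) * dt ** 2
```

For a constant-acceleration model the step is exact, which makes a convenient check of an implementation.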

The covariance matrix of the predicted state estimate, which satisfies a
matrix Riccati differential equation between t_{K-1} and t_K, is computed by
using a trapezoidal integration procedure. Let P*_{K-1} denote the covariance
of x*(K-1) and P*_{K|K-1} the covariance matrix of x*(K|K-1). Then we compute
P*_{K|K-1} by

    P*_{K|K-1} = \Phi_K (P*_{K-1} + .5 Q \Delta t_K) \Phi_K^T + .5 Q \Delta t_K

where

    \Phi_K = I + J(x*(K-1)) \Delta t_K + J^2(x*(K-1)) (\Delta t_K)^2 / 2

and Q is the covariance of the additive state noise w.

Our extended Kalman filter employs a matrix square root formulation
of the covariance equations, see Ref 1. We have found that the square
root formulation not only provides a numerically stable estimation
procedure but is computationally efficient as well. For the predicted
covariance matrix P*_{K|K-1} computed above, the matrix square root L_{K|K-1}
such that

    P*_{K|K-1} = L_{K|K-1} L_{K|K-1}^T

is computed by means of a Choleski decomposition, see Reference 4.
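The covariance prediction and its square root can be sketched in a few lines. This is an illustration under assumptions: the transition matrix is taken as the second order series I + J dt + J^2 dt^2/2, half of the state noise is added before and half after the propagation (the trapezoidal form), and NumPy's Cholesky routine stands in for whatever decomposition routine the BET program uses. The function name is hypothetical.

```python
import numpy as np

def predict_covariance(P, J, Q, dt):
    """Return the predicted covariance and its lower-triangular square root."""
    n = P.shape[0]
    Phi = np.eye(n) + J * dt + 0.5 * (J @ J) * dt ** 2
    half_q = 0.5 * Q * dt
    P_pred = Phi @ (P + half_q) @ Phi.T + half_q
    P_pred = 0.5 * (P_pred + P_pred.T)     # remove round-off asymmetry
    L = np.linalg.cholesky(P_pred)         # P_pred = L @ L.T
    return P_pred, L
```

The symmetrization step is a common safeguard before factoring and does not change the result in exact arithmetic.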

For each scalar observation occurring at the new time t_K an updated
state estimate and an updated square root of the covariance matrix are
computed. Let x*^{(i)}(K) denote the state estimate after processing the
i-th scalar observation at t_K and let L^{(i)}(K) denote the square root of the
covariance of x*^{(i)}(K). These quantities are computed from

MEASUREMENT  BIAS  ESTIMATION

So far we have neglected to discuss one of the most important
considerations in the development of a BET program, namely to account for the
inconsistencies produced by bias errors in the measurements. There is a
natural way of including bias terms in the extended Kalman filter; one
merely adds an additional state variable for each bias term to be considered
and forms the optimal estimate of the biases in the same way as for the
trajectory state variables. This technique is fine for cases where there
are only a few bias terms to be estimated. However, a typical application
of our BET program has a large number of measurements involved. For
example, a LANCE flight test might have two radars, 28 dovap receivers,
eight fixed cameras, and eight cinetheodolites. Considering only one bias
term per measurement, this results in 66 additional state variables to be
estimated (counting range, azimuth, and elevation for each radar, one
measurement per dovap receiver, and azimuth and elevation for each camera
and cinetheodolite: 6 + 28 + 16 + 16 = 66). With a trajectory state
dimension of nine we then would have to compute estimates for 75 state
variables. An ordinary Kalman filtering program using a 75-dimensional
state vector is computationally prohibitive at the present time.
Fortunately, Friedland, see Ref 2, has developed a decomposition technique
for Kalman filters which we were able to adapt and extend to the
measurement bias estimation problem at WSMR. The application of this
decomposition procedure has resulted in a computationally
feasible BET program which includes estimation of measurement biases.

We will call the filter described previously the zero bias filter
and the estimates x* obtained from this filter the zero bias estimates.
Let b denote a p-vector of bias terms. We revise our previous measurement
model to include these terms:

    z_i(K) = h_i(x(K)) + g_i^T(x(K)) b + v_i(K)

where g_i^T is 1 x p. Thus, we allow the bias of each measurement to be a
linear function of several bias variables. For example, a model for the
bias of a radar azimuth measurement might be

    \Delta A = g^T b = b_1 + b_2 \tan E_0 \sin A_0 + b_3 \tan E_0 \cos A_0 + b_4 \tan E_0 + b_5 \sec E_0 + b_6 A_0
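The azimuth bias model is linear in the bias variables, so it can be written as a coefficient row g multiplied by b. The sketch below builds that row; it assumes that A_0 and E_0 are azimuth and elevation angles in radians (their definition does not survive in the text), and the function names are hypothetical.

```python
import math

def azimuth_bias_row(A0, E0):
    """Coefficient row g for the six-term azimuth bias model."""
    t = math.tan(E0)
    return [1.0, t * math.sin(A0), t * math.cos(A0), t, 1.0 / math.cos(E0), A0]

def azimuth_bias(A0, E0, b):
    """Azimuth bias g^T b for bias variables b = (b_1, ..., b_6)."""
    return sum(g * bi for g, bi in zip(azimuth_bias_row(A0, E0), b))
```

Because the row depends on the angles, the same six bias variables produce a different azimuth bias at each point of the trajectory.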
We assume the constant dynamic model for the bias

    b(K+1) = b(K)

Note  that  we  have  not  included  a  state  noise  term  in  the  bias  dynamics 
to  account  for  the  possible  misassumption  that  the  biases  are  constant. 
The  reason  for  not  including  a  state  noise  term  will  become  evident 
later. 

Now let the bias state vector b be adjoined to the trajectory state
x to form the augmented state vector y, of dimension (p+9) x 1,

    y = [x^T, b^T]^T

We  could  proceed  directly  and  obtain  a  new  extended  Kalman  filter  giving 
the  state  estimate 


However, as previously mentioned this is computationally prohibitive for
large p. Instead we employ the filter decomposition procedure developed
by Friedland. This procedure attempts to write the optimal estimate \hat{x},
which includes the effect of biases, as

    \hat{x}(K) = x*(K) + T(K) \hat{b}(K)

where  x*(K)  is  the  zero-bias  estimate  already  obtained  and  T(K)  is  a  9xp 
matrix  to  be  determined.  Upon  examination  we  find  that  the  decomposition 
holds  if  the  filter  satisfies  certain  restrictive  conditions.  The  details 
of  the  derivation  are  tedious  and  will  not  be  presented.  The  restriction 
imposed  by  the  decomposition  procedure  is  merely  an  assumption  we  have 
already  made:  The  bias  dynamics  must  not  include  a  state  noise  term. 

This may not seem like much of a restriction since it was assumed to
begin with, but this was hindsight. Indeed, this is a very severe
restriction since the state noise covariance is used as an adjustable
filter parameter to account for mismodelling errors. Fortunately, there
is another way of accounting for mismodelling errors in the bias dynamics
for which the decomposition does hold. Specifically, we have been able
to extend the decomposition procedure to the case of a fading memory
Kalman filter in which deweighting of past observations is accomplished
by exponentially weighting past residuals. We use small fading factors
to account for bias mismodelling errors and use the state noise covariance
of the zero-bias filter to account for trajectory mismodelling errors.

The form of the bias filter is almost identical to the zero bias
filter with the residuals from the zero-bias filter forming the
observations. Again we employ the square root formulation for the filter. At
a new observation time we have the prediction equations

    \hat{b}(K|K-1) = \hat{b}(K-1)

    C_b(K|K-1) = C_b(K-1)

    T(K|K-1) = \Phi_K T(K-1)

where C_b(K) is the square root of the covariance of \hat{b}(K), \Phi_K is the
transition matrix from the zero-bias filter and T(K) is the combining matrix
of the decomposition. For each observation z_i(K) at the new observation
time new bias estimates \hat{b}^{(i)}(K) and the square root of its covariance
C_b^{(i)}(K) are computed from

    \hat{b}^{(i)}(K) = \hat{b}^{(i-1)}(K) + w_b^{(i)}(K) [ r_i^*(K) - s_i^T(K) \hat{b}^{(i-1)}(K) ]

where

    r_i^*(K) = z_i(K) - h_i(x*^{(i-1)}(K)) = residual from zero bias filter

and w_b^{(i)}(K) is the vector Kalman gain given by

    w_b^{(i)}(K) = C_b^{(i-1)}(K) C_b^{(i-1)T}(K) s_i(K)
                   / [ a_i^2(K) + s_i^T(K) C_b^{(i-1)}(K) C_b^{(i-1)T}(K) s_i(K) ]

    a_i^2(K) = r_i^2(K) + H_i L^{(i-1)} L^{(i-1)T} H_i^T = variance of residual

    s_i^T(K) = g_i^T(x*(K)) + H_i T_{i-1}(K)

    i = 1, ..., m;  m = number of observations at t_K

The square root of the covariance matrix is updated at an observation by

    C_b^{(i)}(K) = C_b^{(i-1)}(K) [ I - \delta_i C_b^{(i-1)T}(K) s_i(K) s_i^T(K) C_b^{(i-1)}(K) ]

where the scalar \delta_i depends on s_i^T(K) C_b^{(i-1)}(K) C_b^{(i-1)T}(K) s_i(K).
For each measurement the combining matrix is updated according to

    T_i(K) = T_{i-1}(K) - w^{(i)}(K) s_i^T(K)

    T(K) = T_m(K)

    T_0(K) = T(K|K-1)

where w^{(i)}(K) is the vector gain from the zero-bias filter.

The optimal state estimate \hat{x}^{(i)}(K) is computed as

    \hat{x}^{(i)}(K) = x*^{(i)}(K) + T_i(K) \hat{b}^{(i)}(K)
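The decomposition can be illustrated for one scalar observation of a linear system. The sketch below is an assumption-laden illustration, not the BET implementation: it uses ordinary covariance matrices instead of square roots, a linear measurement z = H x + g^T b + v, and hypothetical names throughout. It shows the zero-bias update, the bias update driven by the zero-bias residual, the combining-matrix update, and the recombined estimate.

```python
import numpy as np

def decomposed_update(xs, Ps, b, Pb, T, z, H, g, r2):
    """One scalar update of the zero-bias filter, bias filter, and matrix T.

    xs, Ps: zero-bias estimate and covariance; b, Pb: bias estimate and
    covariance; T: n x p combining matrix; H, g: measurement rows; r2: noise
    variance. Returns the updated quantities and the recombined estimate.
    """
    res = z - H @ xs                     # residual from the zero-bias filter
    PHt = Ps @ H
    a2 = H @ PHt + r2                    # variance of the residual
    w = PHt / a2                         # zero-bias gain
    s = g + T.T @ H                      # sensitivity of the residual to the bias
    Pbs = Pb @ s
    wb = Pbs / (a2 + s @ Pbs)            # bias gain
    b = b + wb * (res - s @ b)
    Pb = Pb - np.outer(wb, Pbs)
    xs = xs + w * res
    Ps = Ps - np.outer(w, PHt)
    T = T - np.outer(w, s)               # combining-matrix update
    return xs, Ps, b, Pb, T, xs + T @ b
```

Each call works with an n x n and a p x p covariance instead of one (n+p) x (n+p) matrix, which is the source of the savings when p is large. For a linear problem with uncorrelated priors the recombined estimate agrees with the augmented-state filter.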

To  modify  the  above  equations  for  the  fading  memory  filter,  first 
choose  a  fading  factor  <*^>1.  The  above  equations  are  then  replaced  by 
Cb(K!K-l)  =  O^K-l) 

P*(KIK-1)  =  ^(c^VcK-D+.SQA^JtJ+.SQAl^ 

COMPUTATION  OF  OBSERVATION  VARIANCES 

For each of the scalar measurements z_i(K) a measurement noise variance
r_i^2(K) must be available for use by the Kalman filter. Several possibilities
exist for supplying the variance. An immediately obvious method is to use the
variance values given in specifications of the instrument or to use variance
values computed from past performance history of the instrument. These
methods are most useful when the measurement variance is stable from day
to day and mission to mission. Another method which is often used is to
compute variances from measurement residuals

    \rho_i(K) = r_i^*(K) - s_i^T(K) \hat{b}(K-1)

produced in the BET program. A method for computing variances from the
residuals which is economical in both computing time and storage is the fading
memory variance estimate defined by the following equations

    \bar{\rho}_i(n) = \bar{\rho}_i(n-1) + [\rho_i(n) - \bar{\rho}_i(n-1)] / P_n

    P_n = 1 + w P_{n-1},    P_1 = 1

    S_i(n) = w S_i(n-1) + (1 - 1/P_n) [\rho_i(n) - \bar{\rho}_i(n-1)]^2

    H_n = 1 + w^2 H_{n-1},    H_1 = 1

    \hat{\sigma}_i^2(n) = S_i(n) / (P_n - H_n / P_n)

In the above \bar{\rho}_i(n) is the estimate of the residual mean, w, 0 \le w < 1, is a
fading factor, and \hat{\sigma}_i^2(n) is the variance estimate for the i-th measurement.
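The fading memory recursions need only four stored numbers per measurement channel. The sketch below implements them as read from the equations; the class name is hypothetical, and the choice to return NaN for the first sample (where the denominator P_n - H_n/P_n is zero) is an assumption made for illustration.

```python
import math

class FadingVariance:
    """Fading memory estimate of the mean and variance of a residual sequence."""

    def __init__(self, w):
        if not 0.0 <= w <= 1.0:
            raise ValueError("fading factor must lie in [0, 1]")
        self.w = w
        self.P = 0.0      # sum of weights, so that P_1 = 1
        self.H = 0.0      # sum of squared weights, so that H_1 = 1
        self.mean = 0.0
        self.S = 0.0

    def update(self, rho):
        """Add one residual and return the current variance estimate."""
        w = self.w
        self.P = 1.0 + w * self.P
        self.H = 1.0 + w * w * self.H
        d = rho - self.mean                    # deviation from the previous mean
        self.mean += d / self.P
        self.S = w * self.S + (1.0 - 1.0 / self.P) * d * d
        denom = self.P - self.H / self.P
        return self.S / denom if denom > 0.0 else math.nan
```

Setting w = 1 removes the fading, and the estimate then reduces to the ordinary sample variance, which gives a simple check.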

An optional approach to the estimation of the measurement variance is
the fading memory variate difference technique, see Ref 3. Let y_i(n)
be the k-th backward difference of the observation z_i. If we assume the
means of the k-th differences are zero, the following equations define the
fading memory variate difference method

    S_i(n) = w S_i(n-1) + y_i^2(n)

    \hat{\sigma}_i^2(n) = S_i(n) / [ ... ]
A  COMPUTER  PROGRAM  TO  INVESTIGATE
EXO-ATMOSPHERIC  ENGAGEMENTS  OF  INTERCEPTORS 
AND  RE-ENTRY  VEHICLES 


LTC  M.L.  Roberson,  CPT  C.  Van  Nostrand  and  W.A.  Barbieri 
Development  Branch,  ACS/Studies  and  Analysis 
HQ,  US  Air  Force,  Washington,  D.C.


Computer modeling of military problems is accomplished by a variety
of techniques. The most common is Monte Carlo. There are disadvantages
in the use of all techniques, and three frequently cited in the use of
Monte Carlo are: the failure to calculate the number of replications
required, disregarding the effects of events that occur only at extreme
ends of distributions, and the large amount of computer time that may be
required. The technique of building a variance-covariance matrix to
calculate total distribution effects from one iteration minimizes these
disadvantages while maintaining the flexibility of application that is
characteristic of Monte Carlo models. This presentation describes the
application of this technique to the problem of investigating the
interactions involved in a "one on one" engagement of an anti-ballistic
missile (ABM) versus a re-entry vehicle (RV) in the exo-atmosphere.

First,  the  physical  aspects  of  the  problem  will  be  described.  Then  the 
application  of  the  solution  method  will  be  discussed.  Finally,  the 
computer  program  will  be  outlined. 

Assume that the probability of kill of an RV by an ABM is a function
of the miss distance resulting from the geometry associated with a given
impact point. Figure one presents the environment as defined by the
problem. From launch to impact the RV trajectory is Keplerian on a
round non-rotating earth. The launch and initial impact point, the
Missile Site Radar (MSR) location, the Perimeter Acquisition Radar (PAR)
location, and the ABM launch location are specified in latitude and
longitude.

When the RV is within the specified maximum range, maximum
off-boresight angle, and minimum elevation angle, the PAR begins tracking.
The quality of each tracking point is a function of RV radar
cross-section, off-boresight angle, range, and the radar quality parameters
such as bandwidth, antenna gain, and wavelength.

Similarly  when  the  RV  is  within  maximum  MSR  range,  maximum  off- 
boresight  angle,  and  minimum  elevation  angle,  the  MSR  begins  tracking. 
Again  the  quality  of  each  tracking  point  is  a  function  of  RV  radar  cross- 
section,  off-boresight  angle,  range,  and  the  radar  quality  parameters. 
Tracking  data  is  taken  from  detection  until  the  last  midcourse  correction 
of  the  ABM. 


[Figure 1. Labels recovered: RV TRAJECTORY; OF KILL OF RVs (AS SPECIFIED)]


Constraints may be placed on the ABM launch, such as minimum track
time, minimum intercept altitude, intercept within MSR coverage, etc.
When these constraints are fulfilled the ABM is launched. From burnout
to intercept the trajectory is assumed to be nearly Keplerian. Tracking
data is taken from burnout to the last midcourse correction.

The  problem  does  not  include  details  such  as  blackout  or  allocation 
of  radar  power  although  the  approach  taken  to  the  model  makes  it  readily 
accept  modifications  and  additional  equations.  The  four  major  areas  of 
concern  in  the  model  are: 

1.  The  geometrical  considerations  in  general  such  as  locations, 
fly-out  curves  and  times,  and  performance  constraints. 

2.  The  RV  tracking  points  and  position  prediction  error. 

3.  The  ABM  tracking  points  and  position  prediction  error. 

4.  Determining the P_K based on prediction errors.

KEND  AND  EPERAN 

In  this  presentation  I  will  describe  the  mathematical  basis  for  two 
of  the  subroutines  KEND  and  EPERAN  used  in  EXO-1,  rather  than  describe 
them  in  detail  or  show  results  of  running  them.  [See  Chart  1] 

KENDALL 

The basic work was published in a Rand Report for ARPA by Dr. William
Kendall in February 1963 in a paper entitled "The Probability Distribution
of Anti-Missile Missile Miss Distance Due to Observations and Guidance
Noise." In the preface he mentioned that the problem stemmed from setting
accuracy specifications for the observing instruments of a mid-course
intercept system intended to operate against submarine-launched ballistic
missiles and that results could be applied to a wide variety of intercept
situations.

In  the  problem,  it  is  assumed  that  the  position  of  the  target  at  some 
future time is estimated from a set of measurements (possibly correlated)
of  any  type,  and  that  the  error  in  the  future  position  estimate  is  related 
to  the  errors  in  the  measurements  by  the  usual  linear  relation.  This 
implies  that  if  measurement  errors  are  Gaussian,  future  position  errors 
will  also  be  jointly  Gaussian.  It  is  also  assumed  that  the  noise  in  the 
interceptor  guidance  system  leads  to  uncertainty  in  the  interceptor’s 
position  at  intercept  characterized  by  a  three-dimensional  Gaussian  distri¬ 
bution.  Under  these  assumptions,  the  probability  that  the  miss  distance 
between  interceptor  and  target  will  be  less  than  any  given  amount  is 
determined  analytically.  An  important  feature  of  the  solution  is  that  the 
effects  of  interceptor  guidance  error,  and  of  the  tactical  geometry  and 
measurement  accuracies  can  be  separated. 
Here is the situation at intercept. [See Chart 2] The target actually
moves along the solid line and at the time of intercept is at position T.
The measurement system takes its measurements at an earlier time, and
estimates that the target will be at position W at the time of intercept.
The interceptor is aimed at the target's estimated position at time t, but
due to guidance errors actually arrives at position I. Z is the interceptor
guidance error, the difference between where it was aimed and where it
arrived. X is the target position error at intercept, the difference
between where it is and where it was predicted to be. Y is the distance
between actual locations, that is, the miss distance, and is the vector sum
of the two errors.

In Chart 3 we state the mathematical problem. If measurement errors
have a jointly Gaussian distribution, then to a usually good approximation,
errors in W, i.e., x, are jointly Gaussian with covariance matrix X
(capital X). The guidance-noise-induced errors in the rectangular
coordinates of the AMM are assumed to be jointly Gaussian random variables
with zero means and covariance matrix Z. Thus the rectangular coordinates
of the difference in actual positions are x + z = y, which is Gaussian with
zero mean and, assuming statistical independence of the target position
estimate and interceptor errors, has covariance matrix X + Z = Y.

The  miss  distance  squared  is  shown  in  terms  of  its  components.  [Charts 
4  and  5] 


re (8)

(1)  Moments related to cumulants

    \nu_1 = k_1

    \nu_2 = k_2 + k_1^2

    \nu_3 = k_3 + 3 k_1 k_2 + k_1^3

Cumulants defined by

    \ln \Phi(v) = \sum_{i=1}^{\infty} (jv)^i k_i / i!

(2)  Take log of \Phi(v)

Identity relating det[I - 2jvY] to an exponential of a series in tr Y^i
gives an infinite series in (jv)^i

    \ln \Phi(v) = (1/2) \sum_{i=1}^{\infty} (2jv)^i tr Y^i / i
trY 
OUTLINE  OF  SOLUTION

CHART  5

re (9)  This is of the form of the generalized chi-squared distribution
with


2  (v^)  degrees  of  freedom. 


1  i  2  < Vc5  -  ’ 


Parametric solution curves are displayed in Chart 6.


This  is  one  of  several  convenient  graphs  from  Kendall's  report.  On 
the  ordinate  is  the  probability  that  the  miss  distance  is  less  than  some 
value  k  and  as  the  abscissa  the  parameter  k/*^o.  Curves  are  parametric  in 
v  and  a  and  these  in  turn  are  simple  functions  of  the  covariance  matrix 
of  Y.  The  effects  of  target  position  prediction  and  interceptor  position 
are  separable. 


Application  to  our  problem 

In order to determine v and a we must know tr Y and tr Y^2, where [see
Chart 7]

    Y = X + Z


Kendall's paper also deals with the sensitivities of the errors in
measurements and predicted target position. In our case EPERAN provides
the covariance matrix of the target, X, directly. Assume that noise in
the interceptor is such that the position uncertainty at time t has spherical
symmetry (e.g., when only the rms value of the interceptor miss distance
is known); then

    Z = (a/3) I

where a is the mean squared miss distance due to interceptor guidance
noise, so that tr Z = a. This is a useful approximation when the
interceptor guidance errors are much smaller than target position error,
and the non-spherical nature of the interceptor error is unimportant.

The trace of Y^2 is the sum of three terms. The first two are immediately
available. The last requires a matrix multiplication of known 3x3 matrices.
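The two traces can be computed without forming Y explicitly. The sketch below assumes the spherical guidance covariance Z = (a/3) I, so that tr Z equals the mean squared miss distance a, and expands tr Y^2 = tr Z^2 + 2 tr(XZ) + tr X^2; the function name is hypothetical. For zero-mean Gaussian errors, tr Y is the mean of the squared miss distance and 2 tr Y^2 is its variance.

```python
import numpy as np

def miss_distance_traces(X, a):
    """Return (tr Y, tr Y^2) for Y = X + (a/3) I with X a 3x3 covariance."""
    X = np.asarray(X, dtype=float)
    tr_X = np.trace(X)
    tr_Y = tr_X + a
    # tr Z^2 and 2 tr(XZ) are immediate; tr X^2 needs one 3x3 product
    tr_Y2 = a ** 2 / 3.0 + (2.0 * a / 3.0) * tr_X + np.trace(X @ X)
    return float(tr_Y), float(tr_Y2)
```

Only the last term needs the 3x3 matrix product.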

That completes the description of the calculation of the probability
distribution of miss distance. I will now describe the subroutine EPERAN,
which provides the covariance matrices. [Chart 8]

EPERAN is a computer program for determining the accuracy of
instrumentation used in estimating the position and velocity of a vehicle in
near-Keplerian motion.
[Chart 7: APPLICATION. Expressions for tr Y and tr Y^2; legible note: MATRIX MULTIPLICATION OF KNOWN 3x3]


An early version of EPERAN was described in a Project Rand Report by
Gabler, Belcher, and Johnson in April 1963, entitled "A Computing Program
for Determining Certain Statistical Parameters Associated With Position
and Velocity Errors for Orbiting and Re-entering Space Vehicles." An
improved version appeared two years later in RM-4740-PR by Gabler and
Belcher, entitled "A Computer Program for Tracking Error Analysis of Keplerian
Trajectories". With some minor modification, this is the version that was
used in this application.

Any  computer  program  that  relates  errors  in  a  series  of  measurements 
to  the  covariance  matrix  of  position  errors  at  the  time  of  intercept  can, 
of  course,  be  used.  EPERAN  was  chosen  as  the  simplest  program  that  provided 
the  needed  calculation.  It  was  readily  available,  well  documented,  and  had 
been  checked  against  other  models  because  of  its  use  in  range  instrumentation 
analysis,  and  in  studies  of  space  tracking. 

In EPERAN it is assumed that estimates for orbital parameters have
been made by a general least squares (differential correction) routine
which weights observations inversely with their assigned standard deviations.
All partial derivatives used in error propagation are obtained from
analytic formulas. Provision is made for multiple tracking stations that
move along great circle paths. The results are given in terms of a
coordinate system associated with the trajectory plane of the tracked vehicle.

CALCULATION  TECHNIQUE 

In  the  process  of  statistical  estimation  of  parameters  by  maximum  like¬ 
lihood  or  by  general  least  squares,  a  covariance  matrix  is  usually  obtained 
as  a  representation  of  the  error  in  the  parameters.  The  inverse  of  this 
variance-covariance  matrix  is  called  the  information  matrix.  [Chart  9] 

It  was  found  to  be  more  convenient  to  use  the  information  matrix  than  the 
variance-covariance  matrix.  When  it  is  non-singular,  the  information 
matrix  may  be  inverted  to  give  the  variance-covariance  matrix. 

The information matrix associated with a column vector x is
propagated to a column vector y by the transformation


where S_x is the information matrix for vector x, S_y for vector y, and A is
the matrix of partial derivatives.

Each  piece  of  tracking  data  contributes  to  the  final  information  matrix 
associated  with  the  error  at  intercept  time.  The  contribution  is  determined 
by  the  variance  of  the  measurement  and  the  functional  relationship  between 
the  measurement  and  the  estimated  parameters. 
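The accumulation of information from scalar measurements can be sketched as follows. This is the standard weighted least squares form, stated here as an assumption and not as EPERAN's code: each measurement contributes the outer product of its partial-derivative row divided by its variance, and the covariance is the inverse of the accumulated sum. The function names and the example rows are hypothetical.

```python
import numpy as np

def accumulate_information(rows, variances):
    """Sum the information contributed by scalar measurements.

    rows[k] holds the partial derivatives of measurement k with respect to
    the estimated parameters; variances[k] is its assigned variance.
    """
    rows = np.atleast_2d(np.asarray(rows, dtype=float))
    S = np.zeros((rows.shape[1], rows.shape[1]))
    for a, var in zip(rows, variances):
        S += np.outer(a, a) / var      # reciprocal variance weights the row
    return S

def covariance_from_information(S):
    """Invert a non-singular information matrix to get the covariance."""
    return np.linalg.inv(S)
```

A measurement with a large variance adds little information, and the sum can be inverted only once enough independent measurements have been accumulated.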
For  example,  each  range  measurement  has  an  assigned  variance  of  the

form 

2  ^2 

o  ■  d,p  +  d-  c.080  +  d. 
pi  3  e  4 

where the d_i are functions of the radar or other measurement system and
\theta_e is the elevation angle. [Chart 10] The forms allow for range dependent
or independent errors, and can be obtained either from experimental data or
from theoretical expressions involving signal-to-noise ratio and pulse shape
factors. The information is measured by the reciprocal of the variance


The  information  is  propagated  through  a  succession  of  column  vectors 
arranged  according  to  the  functional  relations  in  accordance  with  equations 
of  Chart  11. 

Finally, the contribution of each piece of tracking data to the final
information matrix is accumulated in S, and at the end of the time interval
S is inverted to give R = S^{-1}, the variance-covariance matrix of position
and velocity errors.

is  the  required  covariance  matrix  of  position  errors. 

Eigenvalues and 50 and 95 percent confidence regions associated with
position  and  velocity  errors  are  also  determined. 

Chart  12  shows  a  typical  input  setup.  Chart  13  shows  an  intermediate 
tracking  point  printout  and  the  R  matrix. 

In  summary,  then,  the  probability  of  kill  distribution  is  found  in 
terms  of  covariance  matrices  of  target  and  interceptor  errors.  The  covar¬ 
iance  matrices  are  calculated  from  the  set  of  measurements. 

Colonel  Roberson  will  now  describe  other  aspects  of  the  computer  model. 

PART  IV 

The scenario for EXO-1 has been described and the method for calculating
the effects of error in position has been presented. I will describe the
model organization and show characteristic results.

The EXO-1 Model, simulating anti-ballistic missile (ABM) interceptors
controlled by radar engaging a ballistic missile re-entry vehicle (RV),
was configured from several existing programs. These programs, outlined
on Chart 14, are: Redstone Arsenal's "Fast Look Analysis Program"
(acronym FLAP), and Rand Corporation's programs for tracking error
analysis (EPERAN) and computation of ABM probability of kill (KEND). A
routine was written to produce the geometry of the ABM flyout curves
(GEOM).

[Chart 14. Model Organization]

FLAP constructs the scenario to be studied and calculates the RV
Keplerian track up to the earliest possible ABM interception point.
FLAP also calculates the initial detection time. This procedure is
repeated for each point desired around the maximum intercept footprint.
A system parameter may then be modified, and the new set of detection
and intercept points and times will be calculated.

EPERAN determines the accuracy of estimating the position and velocity
of a vehicle in Keplerian motion by the method just discussed. To
compute that portion of the total error that is attributed to errors in
the prediction of the position of the RV at the expected time of
intercept, EPERAN gets the initial tracking conditions (detection
position, velocity and time) and ending time (intercept time) from FLAP.
The initial tracking conditions for the ABM are not produced by FLAP.
Therefore an intermediate program (GEOM) is used to calculate the
burnout velocity vector of an ABM boost profile to the RV interception
point. The assumptions made in GEOM are:

1.  All of the ABM's fuel is consumed.

2.  Flight  of  the  interceptor  after  burnout  is  in  a  vacuum. 

3.  The  earth  is  spherical  and  not  rotating. 

4.  Linear  interpolation  between  values  of  parameters  defining 
trajectories  to  two  adjacent  points  will  define  an  additional 
realizable  trajectory  to  a  point  in-between. 
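
Assumption 4 can be sketched as follows. This is only an illustration of linear interpolation between two parameter sets; the parameter names and values are hypothetical and are not taken from GEOM.

```python
# Sketch of GEOM assumption 4: a trajectory to an in-between point is
# defined by linear interpolation of the parameters of two adjacent
# trajectories. Parameter names and numbers below are hypothetical.

def interpolate_trajectory(params_a, params_b, fraction):
    """Interpolate every parameter; fraction=0 gives A, fraction=1 gives B."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return {name: params_a[name] + fraction * (params_b[name] - params_a[name])
            for name in params_a}


point_a = {"burnout_speed_fps": 10000.0, "flight_path_angle_deg": 40.0}
point_b = {"burnout_speed_fps": 12000.0, "flight_path_angle_deg": 50.0}

# A point one quarter of the way from A to B
between = interpolate_trajectory(point_a, point_b, 0.25)
```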

From GEOM and FLAP, EPERAN gets sufficient conditions to compute the
error in the prediction of the position of the ABM at the expected time
of intercept. EPERAN also requires RV and ABM cross sections and radar
description parameters. The output from EPERAN in each case is a
variance-covariance matrix of position errors. This information is used
by the KEND program to calculate the mean and standard deviation of the
miss distance. Finally, the probability of kill for the given weapon
radius is computed and printed.

Chart 15 shows a maximum coverage footprint that is a result of FLAP.
The additional information that can be obtained from EXO-1 is shown on
Chart 16 with isoquants superimposed over the maximum coverage contour.

What we have presented is a problem that was solved by simulation.
However, from the start it appeared to us that the use of the Monte
Carlo technique would require such extensive computer time that use of
the model would be greatly restricted. The analytical solution via the
variance-covariance technique eliminated this problem. EXO-1 on the
GE-635 computer requires about 1 minute of central processor time to
produce the probability of kill at one point. A similar program for
endo-atmospheric interceptions generating 30 Monte Carlo iterations
requires 10 minutes per point. We estimate that the effort required to
build a computer model for a new application of this technique could be
as much as two man years.

We are interested in any solution method that may be used as an
alternative to Monte Carlo and that will produce reliable information
in less computer time.


EXO-1 REFERENCES

A Computer Program to Investigate Exoatmospheric Engagements of
Interceptors and Re-entry Vehicles. Washington, D. C.: Air Force
Assistant Chief of Staff, Studies and Analysis, Strategic Unmanned, 1971.

Gabler, R. T. and S. J. Belcher, A Computing Program for Tracking Error
Analysis of Keplerian Trajectories. The RAND Corporation, RM-4740-PR,
October 1965.

Kendall, W. B., The Probability Distribution of Anti-Missile Missile
Miss Distance Due to Observation and Guidance Noise. The RAND
Corporation, RM-3505-ARPA, February 1963.

The EXO-1 briefing was given by:

Barbieri, W. A., The RAND Corporation, 2100 M Street, N. W.,
Washington, D. C. 20037

Roberson, M. L., Lt. Col., Asst. Chief of Staff, Studies and Analysis,
Hq USAF, Washington, D. C. 20330

Van Nostrand, C., Capt., Asst. Chief of Staff, Studies and Analysis,
Hq USAF, Washington, D. C. 20330

THE WIND TUNNEL FREE FLIGHT TESTING TECHNIQUE


A.  S.  Flatou 

Research  Aerospace  Engineer 
U.  S.  Army  Ballistic  Research  Laboratories 
Aberdeen  Proving  Ground,  Maryland 


Symbols

a              Speed of sound, ft/sec
d              Model reference diameter, ft
$I_x$          Axial moment of inertia of model
$I_y$          Transverse moment of inertia of model
$k_a$          Axial radius of gyration of model = $\sqrt{I_x/md^2}$
$k_t$          Transverse radius of gyration of model = $\sqrt{I_y/md^2}$
L              Total length of flight, ft
m              Mass of model, slugs or grams
Ma             Mach No., V/a
H, M, P, T, G  See section entitled "The Yaw Reduction"
p              Spin rate of model, radians per second or revolutions
               per minute
$P_0$          Tunnel stagnation pressure, psia
q              Free stream dynamic pressure = $\tfrac{1}{2}\rho U_\infty^2$,
               or missile angular rotation
s              Dimensionless distance along flight path =
               $\frac{1}{d}\int U_\infty\,dt$
S              Area = $\pi d^2/4$, feet$^2$ or inches$^2$
t              Time, seconds or milliseconds
$T_0$          Tunnel stagnation temperature, °R
$T_\infty$     Tunnel static temperature, °R
$U_\infty$     Free stream tunnel air velocity
V              Model velocity in X direction
X, Y, Z        Coordinate axes: X is along tunnel axis, positive
               upstream; Y is horizontal and perpendicular to X; Z is
               vertical and perpendicular to X, positive down (right
               hand rule applies to directions and rotations)
$\rho$         Air density, slugs per ft$^3$
$\alpha$       Angle of attack in X-Z plane
$\beta$        Angle of attack in X-Y plane
$\delta$       Total angle of attack or yaw = $\beta + i\alpha$
$\overline{\delta^2}$  Mean square yaw =
               $\frac{1}{L}\int_{-L/2}^{L/2}\delta^2\,ds$
$\nu$          $pd/2U_\infty$
$\phi$         Roll angle
$\tilde{\xi}$  $\xi\,e^{i\phi}$
$\mu$          Viscosity
$C_D$          Drag coefficient = $D/qS$
$C_{D_0}$      Drag at zero lift
$C_{L_\alpha}$ Lift curve slope = $L/qS\sin\delta$
$C_{N_\alpha}$ Normal force coefficient = $N/qS\sin\delta$
$C_{N_{p\alpha}}$  Magnus force coefficient = $M.F./qS\nu\sin\delta$
$C_{M_\alpha}$ Pitching moment coefficient = $M/qSd\sin\delta$
$C_{M_{p\alpha}}$  Magnus moment coefficient = $M_p/qSd\,\nu\sin\delta$
$C_{l_p}\nu + C_{l_\delta}\delta$  Rolling moment coefficient = $l/qSd$
$C_{M_q}(qd/2V) + C_{M_{\dot\alpha}}(\dot\alpha d/2V)$  Damping moment
               coefficient = $M/qSd$
Re             $\rho U_\infty d/\mu$
C.G.           Location of C.G. from model base, % of length
$\frac{\rho S d}{2m}\,C_{(i)}$

Abstract 


The  free  flight  wind  tunnel  technique  has  been 
used  successfully  to  obtain  aerodynamic  coeffi¬ 
cients  on  a  variety  of  configurations  with  and  with¬ 
out  spin.  High  mass  to  moment  of  inertia  ratio 
models  are  electroformed  as  .001  inch  thick  nickel 
shells  with  tungsten  cores  at  the  center  of  gravity. 
This  permits  wide  C.G.  variations  and  provides 
models  which  can  withstand  the  high  speed,  high 
temperature flows. The model launcher is based on
the  principle  of  an  inverted  pea  shooter.  A  £-inch 
diameter  tube  is  inserted  into  the  aft  portion  of 
the  model  through  the  model  base,  and  compressed 
air acting through the tube on the tungsten core
propels the model forward. Spin can be imparted to
the model with an air turbine just prior to launch.
Cones  have  been  launched  with  success  up  to  10,000 
rpm,  while  high  fineness  ratio  models  have  been 
launched  at  40,000  rpm  with  moderate  success. 

Introduction 


In recent years a free flight wind tunnel testing technique has been
developed which is advantageous for obtaining aerodynamic data on
certain configurations (reference 1). The largest and most desirable
advantages over other wind tunnel testing techniques are the sting
interference-free data and the relative ease of testing long bodies.
Advantages over range testing techniques are the large number of cycles
per flight, the large number of photographic stations (up to 500) per
flight, and the low model accelerations necessary to launch the model.
The effective range $(U_\infty + V)t$ for this type of testing is 200 to
400 feet, depending on the tunnel Mach number and the time of flight.
The most popular use of this free flight technique has been for dynamic
stability data; however, it also appears attractive for Magnus and roll
data on spinning configurations.

Magnus data have been obtained in wind tunnels on some ballistic shapes
using sting supported models and strain gage balances (reference 2);
however, in some cases sting interference is indicated. In hypersonic
tunnels the complication of cooling both bearings and balances is a
further deterrent to standard type Magnus testing. Roll damping
measurements in a wind tunnel have depended on low friction air
bearings, and here again free flight testing avoids this complication.
It is the object of this paper to present the BRL (Ballistic Research
Laboratories) plans for free flight testing of spinning and nonspinning
ballistic shapes in our hypersonic wind tunnel.

It  is  possible  to  obtain  all  of  these  measure¬ 
ments  (dynamic  stability,  Magnus  force  and  moment, 
and  roll  damping)  from  one  flight,  for  each  of 
these  measurements  requires  a  model  having  low 
moments  of  inertia  and  small  mass.  Dynamic  stabil¬ 
ity  measurements  require  low  transverse  moments  of 
inertia; Magnus measurements require low mass and
low transverse moments of inertia; and roll damping
requires low axial moments of inertia. Models of
this  type  are  built  as  thin,  lightweight  shells 
containing  a  heavy  metal  core  close  to  the  center 
of  gravity  of  the  model.  The  only  change  between 
a  nonspinning  dynamic  stability  model  and  a  spin¬ 
ning  one  to  obtain  all  three  pieces  of  data  is  the 
addition  of  spin  rate  counting  pins  in  the  base  of 
the  model  or  darkening  of  one  side  of  the  model. 


Optical System


The  optical  system  is  used  to  obtain  a  series  of 
consecutive  pictures  of  the  model  in  flight  so  that 
the  trajectory  including  the  angle  of  attack  histo¬ 
ry  of  the  model  can  be  examined.  It  is  a  dual  path 
orthogonal  system  so  that  three  dimensional  coordi¬ 
nates  of  the  model  are  recorded  during  the  flight. 
The  orthogonal  system  makes  it  possible  to  obtain 
data  on  all  types  of  ballistic  shapes  including 
spinning  and  nonspinning  bodies  having  both  planar 
and  nonplanar  motions. 


Figure  1.  Instrumentation  for  Free  Flight  Testing 
in  the  BRL  Hypersonic  Tunnel 

The optical system requires two 35mm Fastex cameras mounted (Figure 1)
with the side view camera photographing the vertical motion of the
model and the top view camera photographing the horizontal motion. Both
views see a silhouette of the model against lighted, white screens
containing the measuring reference lines (Figure 1). Optical alignment
of the cameras is made before each test period and is such that the
model coordinates in each individual frame can be read to .03 inch.

Each of the screens is illuminated by three 1000-W photographic Sun Gun
lights located approximately 4 feet from the screens. For correct
exposure of the film (Tri-X Negative), the camera lens is set at f/8,
using a 2 inch lens. The cameras are approximately 60 inches from the
tunnel centerline. Identical one-millisecond timing marks, which are
coded from a prescribed zero time, are placed on each film during the
flight. In this manner the time-motion histories from each film can be
linked together so that the three dimensional time-motion history of
the model can be obtained.
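
The linking of the two films through their common timing marks can be sketched as below. This is a minimal illustration, not the BRL procedure: the data layout (dictionaries keyed by the coded time in milliseconds) and the averaging of the two X readings are assumptions made for the example, and the coordinates are hypothetical.

```python
# Sketch: combine side-view (x, z) and top-view (x, y) film readings into
# one three dimensional time history, using the shared timing-mark code.
# Layout and the averaging of the two X readings are assumptions.

def merge_views(side_view, top_view):
    """side_view: {time_ms: (x, z)}; top_view: {time_ms: (x, y)}."""
    merged = []
    for time_ms in sorted(set(side_view) & set(top_view)):
        x_side, z = side_view[time_ms]
        x_top, y = top_view[time_ms]
        x = 0.5 * (x_side + x_top)  # X is visible in both views
        merged.append((time_ms, x, y, z))
    return merged


side = {0: (10.0, 1.25), 1: (10.25, 1.5), 2: (11.0, 1.75)}
top = {1: (10.75, -0.5), 2: (11.0, -0.25), 3: (11.5, 0.0)}

# Only frames whose timing code appears on both films are kept
history = merge_views(side, top)
```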


The  Models 

The construction of the models is based on the
fact  that  they  must  be  lightweight,  have  low  axial 
and  transverse  moments  of  inertia,  and  have  a  high 
mass  to  moment  of  inertia  ratio.  These  character¬ 
istics  are  obtained  by  forming  the  model  as  a 
lightweight,  thin-skin  shell,  with  a  heavy  metal 
core  placed  around  the  model  center  of  gravity 
(Figure  2).  These  models,  however,  must  withstand 
the  rigors  of  exposure  to  hypersonic  flow  for  a 
short  time,  so  that  certain  precautions  must  be 
taken  to  insure  this.  First,  it  is  estimated  that 
portions  of  the  model  may  reach  800°F  during  the 
flight;  and  second,  the  pressure  distributions  over 
the  model  may  be  quite  severe  as  the  model  emerges 
from  the  launch  chamber  into  the  main  tunnel  flow. 


Figure  2.  Free  Flight  Model  Design 

A  survey  of  the  field  shows  that  there  are  a 
number  of  materials  which  will  withstand  the  expect¬ 
ed  temperatures;  however,  there  are  only  a  few  of 
these  which  are  sufficiently  lightweight  to  keep 
the  moments  of  inertia  low.  The  most  numerous  high 
temperature  lightweight  materials  found  are  a  series 
of  porous  materials  made  by  Emerson  and  Cuming 
Company.  These  materials  have  specific  gravities 
from  .24  to  .75  and  will  withstand  temperatures  of 
at  least  800°F,  and  in  some  cases  3000°F.  They  are 
quite fragile, though, having flexural strengths
between 500 and 1000 psi, and therefore must be
reinforced in order to withstand the pressure
distributions.

The first models used in our experiments were made of thin walled steel
or aluminum tubing machined down to 3 mil walls, with the nose and tail
being made of the high temperature porous material. Later models have
been made by electroforming nickel on an aluminum mandrel to a
thickness of 1 or 2 mils and then eroding the mandrel, leaving the
nickel shell. It is believed that thin skinned electroformed models can
be made for any configuration for which an aluminum mandrel can be
machined. To date, electroformed models of 10° half angle cones, 20 and
30 caliber cone cylinder flares and a finned configuration (Figure 3)
have been made.

The  first  models  used  lead  for  the  metal  core, 
while  later  models  have  used  tungsten.  Lead  is 
easier  to  machine  but  has  a  lower  mass  to  moment  of 
inertia  ratio  than  tungsten  for  the  same  geometric 
configuration.  Lead  also  lowers  the  maximum  allow¬ 
able  model  temperature,  thereby  increasing  the  risk 
of  losing  a  model  during  the  prelaunch  period  in 
the hypersonic tunnel. To secure the model to the
launcher, one end of a 5 mil copper wire is imbedded
into the metal core.


Figure 3. An Example of the Models Used with the
Free Flight Launcher


The  Launcher 

After several attempts at launching models into the hypersonic test
section were made, it became apparent that a number of precautions must
be taken in order that successful launches occur a large percentage of
the time. The launcher and model must be protected from the hot high
speed tunnel air by enclosing them within an insulated air cooled
chamber. Doors on the front of the chamber must open and close
automatically during the launch cycle. Symmetric flow conditions in the
door region are required to hold down velocity attenuation and lateral
jump. Some type of guidance to maintain model attitude during the
acceleration phase is necessary. For spinning models a turbine drive
must be incorporated.


Figure  4.  Free  Flight  Launcher 

A  launcher  design  (Figure  4),  which  incorporates 
the  above  features,  has  been  developed  and  used  by 
the  Ballistic  Research  Laboratories  for  several 
months. The launcher is housed in an insulated,
air-cooled chamber which keeps the ambient temperature
below 200°F, while the tunnel stagnation
temperature is 1300°F. Doors on the front of the
chamber  automatically  open  before  the  launch  and 
close  after  the  launch  so  that  the  chamber  tempera¬ 
tures  remain  low.  The  launcher  is  mechanically 
simple  and  contains  very  few  moving  parts.  It  is 
unique  in  that  no  part  other  than  the  model  is 
accelerated  forward  during  the  launch,  and  the  model 
is  guided  during  the  full  period  of  acceleration  by 
an  internal  push  tube.  The  launcher  will  accept  any 
shape  model  from  i-inch  diameter  up  to  at  least  1 
inch  diameter,  and  up  to  7^  Inches  long;  the  only 
requirement  on  these  models  being  that  they  will 
accept a 3/16-inch minimum diameter cylindrical push
tube mounted through the model base and against the
metal core, which is common to all of
these  models.  Each  model  requires  a  pretest  assem¬ 
bly  and  wire  lashing  to  the  cylindrical  tube,  but 
can  be  readily  installed  on  the  launcher  just  prior 
to  launch.  The  launcher  is  a  single-shot  launcher, 
but  means  of  reloading  it  while  the  tunnel  is  run¬ 
ning  are  being  considered  now. 

The  principle  of  operation  of  the  launcher  is 
that  of  a  shock  tube.  A  normal  shock  is  generated 
by  breaking,  with  a  pricker,  an  aluminum  foil  dia¬ 
phragm  which  is  mounted  at  one  end  of  a  small  diam¬ 
eter,  hollow  cylinder  (Figure  4)  called  the  push 
tube. The strength of the normal shock and, in
turn, the model launch velocity are governed by the
initial pressure in the supply chamber behind the
diaphragm. The normal shock travels the length of
the push tube and reflects from the model metal
core which is lashed to the other end of the push
tube. The lashing is accomplished by imbedding a
5-mil  diameter  knotted  copper  wire  in  the  metal  core 
and  soldering  the  other  end  of  the  wire  to  an  island 
mounted  in  the  center  of  the  push  tube.  When  the 
aluminum  diaphragm  is  broken,  the  increased  pressure 
on  the  metal  core  base  breaks  the  wire  at  the  knot, 
and  the  model  is  launched.  The  exact  aerodynamic 
action  which  takes  place  during  the  several  milli¬ 
seconds  required  for  the  model  to  leave  the  push 
tube  is  quite  complicated;  however,  it  is  known  that 
the  shock  is  traveling  at  much  higher  velocities 
than  the  model,  so  that  several  shock  reflections 
must  take  place  in  the  push  tube  during  the  launch 
time.  The  strength  and  number  of  shock  reflections 
are  a  function  of  the  cylinder  and  supply  chamber 
configuration,  and  therefore,  each  launcher  config¬ 
uration  must  be  calibrated  to  determine  the  correct 
pressures  for  launching  models  at  the  correct  veloc¬ 
ity.  The  launcher  will  launch  models  with  speeds 
from  20  feet  per  second  to  75  feet  per  second. 

To  adapt  the  nonspinning  launcher  to  a  spinning 
launcher  requires  that  a  custom-built  air  turbine  be 
inserted  between  the  aluminum  diaphragm  and  the  push 
tube.  The  air  turbine  is  designed  the  same  as  a 
dental  air  drill,  except  that  it  is  approximately 
four  times  larger  (Figure  4).  The  enlargement  is 
necessary  so  that  the  main  shaft  can  be  hollow  and 
form  the  first  portion  of  the  push  tube.  Also,  the 
enlarged  turbine  blades  increase  the  available 
torque  and  lower  the  maximum  allowable  spin  rate. 
Since  the  dental  drills  have  been  developed  for  spin 
rates  of  400,000  rpm,  the  enlarged  version  will 
still  easily  spin  up  to  50,000  rpm.  It  is  necessary 
to  place  a  labyrinth  seal  between  the  spinning  tur¬ 
bine  shaft  and  the  stationary  diaphragm  in  order  to 
prevent  large  attenuation  of  the  launch  pressure 
after  the  diaphragm  is  broken.  The  remainder  of  the 
push tube, including the model, rotates with the
turbine, so that once the supply air passes through
the diaphragm it is subject to a rotating wall
boundary condition. Monitoring of the prelaunch
spin rate will be accomplished using a photoelectric
pick-up, while the actual spin rate during
flight will be read from the high speed motion
pictures.


The  launching  procedure  for  both  the  nonspinning 
and  the  spinning  launcher  is  essentially  the  same. 
In  both  cases  the  model  is  placed  on  the  launcher, 
the  tunnel  is  started  and  brought  up  to  the  desired 
temperature  and  pressure.  Just  prior  to  launch  the 
predicted  supply  pressure  is  set,  and  in  the  case 
of  a  spinning  launch  the  model  is  brought  up  to  the 
desired spin rate. At this point the actual launch
is actuated by a six-channel timer which controls
the photographic lights, the launch pricker, the
cameras, the timing mark coder and the launcher
doors. The launch is triggered by the pricker rupturing
the diaphragm; however, all of these events
must be triggered accurately so that the flight is
photographed on the high speed portion of the film.


In  launching  models  into  the  tunnel  airstream  we 
wish  to  obtain  flights  in  which  the  model  will 
oscillate  through  the  maximum  number  of  cycles. 

The  number  of  cycles  is  given  by 


N  = 


_1  /jno; 

TT  ICp 


1 


$C_{M_\alpha}$ and $C_D$ are fixed by the configuration being tested; d
and m/I are maximized by making the model as large as possible and by
concentrating as much mass as possible close to the C.G.; S is
maximized by finding the longest path or the longest flight time. The
longest flight path is obtained by adjusting the launch velocity so
that the model reaches the upper portion of the test rhombus and
returns downstream before gravity has time to pull the model out of the
test rhombus. The launch velocity for constant deceleration is
$V_0 = \sqrt{2\,S\,D/m}$


and  the  time  of  flight  is  t  = 


For the 10° half angle cone models the D/m values range from 150 to
300 ft/sec$^2$ for the cases tested. The launch velocities are 40 to 50
ft/sec, and the flight times are such that the models would not quite
fall through the test rhombus due to gravity. For the high fineness
ratio models the D/m values range from 50 to 100 ft/sec$^2$, and the
launch velocities are in the range 15 to 25 ft/sec. This, plus the
relatively long push tube, especially for the 30 caliber models,
decreases the launch pressure such that breaking of the lash wire is
marginal.
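
The launch velocity relation for constant deceleration, V0 = sqrt(2 S D/m), can be checked with a short sketch. The out-and-back flight time used here (2 V0 divided by D/m) and the gravity drop estimate are assumptions added for illustration; the upstream travel distance of 4 ft is a hypothetical value, while D/m = 200 ft/sec^2 lies inside the range quoted for the cone models.

```python
# Sketch: launch velocity for constant deceleration and the resulting
# out-and-back flight time. The flight-time formula and the gravity drop
# are illustrative assumptions; the distance value is hypothetical.
import math

G_FPS2 = 32.174  # standard gravity, ft/sec^2


def launch_velocity(distance_ft, decel_fps2):
    """V0 = sqrt(2 S D/m): the model just stops after travelling S."""
    return math.sqrt(2.0 * distance_ft * decel_fps2)


def out_and_back_time(v0_fps, decel_fps2):
    """Time to decelerate to rest plus an equal time to drift back."""
    return 2.0 * v0_fps / decel_fps2


def gravity_drop(time_sec):
    """Vertical fall under gravity alone during the flight."""
    return 0.5 * G_FPS2 * time_sec ** 2


v0 = launch_velocity(4.0, 200.0)         # ft/sec
flight_time = out_and_back_time(v0, 200.0)  # sec
drop = gravity_drop(flight_time)         # ft
```

With these numbers the launch velocity falls in the 40 to 50 ft/sec band quoted for the cone models.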


To overcome this, the launching system is being modified so that the
launch occurs in two steps. First, the present launcher is given a
vertical and a downstream horizontal velocity. Second, the present
launcher is activated so that the resulting model launch motion is
forward and upward with the model at low angle of attack.

The model will be launched from the lower downstream portion of the
test rhombus such that the motion will carry the model to the top
forward portion of the rhombus before gravity and the drag pull the
model down towards the launch point.

The reason for the initial horizontal downstream velocity imparted by
the motion of the launcher is to increase the launch velocity, $V_0$,
thus increasing the required launch pressures and insuring breakage of
the lashing wire.


Data  Reduction 

Data reduction of each model launched in the wind tunnel is
accomplished by fitting the time motion history of the flight to the
equations of motion for a ballistic missile. The reduction is basically
the same reduction used by the BRL aerodynamic ranges (references 3 and
4). However, due to differences in methods of recording the data and
other characteristics of the test, some changes in the reduction
procedure are necessary.

The time motion history is obtained from the orthogonal high speed
films, while the equations of motion for the free flight missiles are
derived in reference 4. Fitting the equations of motion to the data has
been programmed on the BRL high speed computer, BRLESC, so that rapid
reduction of the data to aerodynamic coefficients is possible. The
reduction is separated into several steps so that various aspects of
the flight can be considered separately.


Tunnel Aerodynamic Conditions

This portion of the program concerns the reduction of the tunnel
operating conditions to parameters which can be used for reduction of
the model geometric data to aerodynamic coefficients. The tunnel
aerodynamic reduction uses the compressible fluid flow equations for a
de Laval nozzle which are outlined and tabulated in reference 5.
Quantities which are either set during the tunnel test or computed from
the tunnel conditions are: Mach number, stagnation pressure,
temperature, dynamic pressure, test section air velocity and Reynolds
number.
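
A minimal sketch of such a reduction is given below, using the standard isentropic relations for a calorically perfect gas with a ratio of specific heats of 1.4. This is an assumption for illustration only: the tables of reference 5 are not reproduced here, and real gas effects at hypersonic stagnation temperatures are ignored. The function name, the gas constant value and the sample inputs are hypothetical.

```python
# Sketch: test-section conditions from Mach number and stagnation
# conditions, assuming isentropic flow of a perfect gas (gamma = 1.4).
# Not the BRL program; inputs below are hypothetical.
import math

GAMMA = 1.4
R_AIR = 1716.0  # approximate gas constant for air, ft-lbf/(slug-deg R)


def tunnel_conditions(mach, p0_psia, t0_rankine):
    factor = 1.0 + 0.5 * (GAMMA - 1.0) * mach ** 2
    t_static = t0_rankine / factor                          # deg R
    p_static = p0_psia / factor ** (GAMMA / (GAMMA - 1.0))  # psia
    q = 0.5 * GAMMA * p_static * mach ** 2                  # psia
    speed_of_sound = math.sqrt(GAMMA * R_AIR * t_static)    # ft/sec
    velocity = mach * speed_of_sound                        # ft/sec
    return {"t_static_R": t_static, "p_static_psia": p_static,
            "q_psia": q, "a_fps": speed_of_sound, "u_fps": velocity}


cond = tunnel_conditions(mach=3.0, p0_psia=100.0, t0_rankine=2800.0)
```

The Reynolds number would follow from the density, velocity, model diameter and a viscosity law, which is omitted here.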


Model  Geometric  Reduction 


The geometric data reduction consists of the evaluation of the motion
of the model to define the space coordinates of the model center of
gravity and the variation of angle of attack $(\beta + i\alpha)$ as a
function of time. The x, y and z coordinates of the model nose and base
are read from each of the Fastex films (Figure 5) using an optical film
comparator. These data are used to determine the lateral motion of the
model center of gravity and the angle of attack motion of the model.
The film speed or the frame time is obtained from the 1000 cycle timing
marks which have been placed on the side of each film during the
operation of the Fastex cameras.
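
How nose and base readings yield a center of gravity position and the two angle of attack components can be sketched as follows. The sign conventions, the use of a fractional C.G. location measured from the base, and the sample coordinates are assumptions for illustration; they are not taken from the BRL reduction program.

```python
# Sketch: center of gravity and angle of attack components from the
# measured nose and base coordinates of one frame. Sign conventions and
# sample values are illustrative assumptions.
import math


def attitude_from_points(nose, base, cg_fraction_from_base):
    """nose, base: (x, y, z) with X upstream, Y horizontal, Z down."""
    dx = nose[0] - base[0]
    dy = nose[1] - base[1]
    dz = nose[2] - base[2]
    cg = tuple(b + cg_fraction_from_base * (n - b)
               for n, b in zip(nose, base))
    alpha = math.atan2(dz, dx)  # angle in the X-Z plane, radians
    beta = math.atan2(dy, dx)   # angle in the X-Y plane, radians
    return cg, alpha, beta


cg, alpha, beta = attitude_from_points(
    nose=(14.0, 2.0, 1.0), base=(10.0, 2.0, 1.0), cg_fraction_from_base=0.25)

# Complex yaw for this frame, written as beta + i*alpha
complex_yaw = complex(beta, alpha)
```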

The time histories obtained in the geometric data reduction define the
complex yaw $(\beta + i\alpha)$ and complex swerve motion (y + iz) of
the model in flight and, also, define the velocity and spin variation
with time. Examples of this information are presented in Figures 7, 8,
9 and 10. The data accuracy is governed by the resolution of the Fastex
cameras, which in this case corresponds to coordinate accuracies of
0.03 inch. The results of the model geometric data reduction are used
as input information for the remainder of the data reduction.

Figure 5. High Speed Photographs Taken for
Measuring the Model Coordinates


Drag  and  Roll  Reduction 

The  drag  and  roll  reduction  are  two  separate 
reductions;  however,  since  the  procedures  are  the 
same  and  their  reductions  are  independent  of  the 
remaining  reductions,  they  can  be  discussed  as  a 
group. 

The average drag force acting on the model during the flight can be
obtained by assuming constant deceleration during the flight. Constant
deceleration is not strictly true, for it will vary with angle of
attack; however, this variation is within the accuracy of the data. The
deceleration is computed (least squares fit) from the velocity-time
curve (Figure 6), and the drag follows from Newton's equation

$$D = -m\,\ddot{X}$$

Figure 6. Velocity-Time Curve

Along with the average drag computation it is necessary to compute the
mean square yaw so that the drag curve for the configuration can be
obtained. The mean square yaw is obtained from:

$$\overline{\delta^2} = \frac{1}{L}\int_{-L/2}^{L/2} \delta^2\,ds$$

and since $C_D = C_{D_0} + C_{D_{\delta^2}}\,\overline{\delta^2}$ we can
obtain $C_{D_0}$ once two or more flights of the same configuration have
been made at different mean square yaw values (Figures 10 and 11).

The roll reduction depends on measuring the roll deceleration of the
model during the flight. This is done by measuring the rate of rotation
of a white strip on the model or the counting pins (Figure 7). Again,
assuming constant deceleration, the rolling moment can be obtained from

$$C_{l_p}\,\nu + C_{l_\delta}\,\delta = \frac{l}{q\,S\,d}$$

Figure 7. Spin-Deceleration Curve

$C_{l_\delta}$ is the rolling moment due to fin cant and needs only to
be considered for finned configurations.
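
The two straight-line fits in the drag reduction can be sketched with one helper: a least squares fit of velocity against time gives the deceleration, and a fit of drag coefficient against mean square yaw gives the zero-yaw drag as the intercept. All numbers below (velocities, mass, drag coefficients, yaw values) are hypothetical and serve only to exercise the arithmetic.

```python
# Sketch: least squares straight-line fits used in the drag reduction.
# All data values are hypothetical.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Velocity-time curve: the slope is the (negative) acceleration
times_sec = [0.00, 0.05, 0.10, 0.15, 0.20]
velocities_fps = [40.0, 30.0, 20.0, 10.0, 0.0]
acceleration, _ = fit_line(times_sec, velocities_fps)
mass_slugs = 0.001
drag_lbf = -mass_slugs * acceleration  # D = -m * (d2X/dt2)

# Drag coefficient against mean square yaw from several flights:
# the intercept is the drag coefficient at zero yaw
mean_square_yaw = [0.00, 0.01, 0.02]
drag_coefficients = [0.10, 0.12, 0.14]
cd_slope, cd_zero_yaw = fit_line(mean_square_yaw, drag_coefficients)
```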

Figure 10. The Drag of the 10° Half Angle Cone, d = .9"


The  Swerve  Reduction 

The swerve reduction analysis studies the lateral motion of the model
center of gravity and determines the lift, Magnus and damping forces
acting on the model. The reduction involves fitting the equation of
motion of the center of gravity lateral coordinates to the actual
lateral motion of the center of gravity. This equation of motion is
equation 9.8 of reference 4.

Figure 9. Swerve Plot


The  Yaw  Reduction 


The yaw reduction is designed to obtain from the model motion the
pitching moment, the damping moment, and the Magnus moment
coefficients. This is accomplished by fitting the complex yawing motion
of the model (Figure 8) to the equations of motion of the model:

$$\xi'' + (H - iP)\,\xi' - (M + iPT)\,\xi = G$$

The above is equation 6.8 of reference 4, where its derivation is
shown. It is too lengthy to present here, and it will have to suffice
that it is similar to the more familiar planar equation of motion


$$I_y\,\ddot{\alpha} + \mu\,\dot{\alpha} + M\,\alpha = 0$$

Equation 6.8 takes into account the nonplanar
motion of the model and for ease of reduction the
derivatives have been taken with respect to distance
rather than with respect to time.

The  fit  of  the  equation  of  motion  to  the  complex 
yaw  data  is  illustrated  in  Figure  8. 
Bq, are complex constants.

The Coriolis force term has been omitted because
in the wind tunnel free flight technique the motion
of the model with respect to the earth is negligible.
An example of the swerve fit to the actual
motion is shown in Figure 9. Again, for further
explanation of this reduction procedure, the reader
is referred to reference 4.


Single  Plane  Reduction 

Recently we have started modifying one of our
supersonic tunnels (M = 1.25 to 5.0) so as to increase
our free flight capabilities to the complete
supersonic speed range. Obtaining orthogonal views
would entail considerable effort to
locate a viewing window in the ceiling of the test
section. In an attempt to eliminate this complication
we are investigating the possibility of using
a single view reduction system in which only the
projected coordinates of the model in the X-Z plane
would be recorded. By solving only the imaginary
part of the epicyclic equation of motion

$$\xi'' + (H - iP)\,\xi' - (M + iPT)\,\xi = G$$

the aerodynamic coefficients will be obtained. The
ease and accuracy of this solution are now being
compared for us with the full viewing system method
by the University of Notre Dame.
Figure 11. Drag of Cone Cylinder Flare Configuration

Model  Flights  and  Aerodynamic  Data 

A number of flights of 10° half angle cone
models and high fineness ratio cone cylinder flare
models have been reduced and analyzed. Most of the
flights have been made at M = 9.2 and have included
variable Reynolds number, variable center of
gravity and variable spin. The cone models included
both flat base and hemispherical base configurations
in order to determine the variation in the
pitch damping moment due to the base configuration,
and the cone cylinder flare models included length
to diameter ratios of 15 to 30. Launching the
cones proved to be quite successful. However, some
difficulties were experienced in launching the high
fineness ratio bodies. The longer launching distance
and, in the case of spinning models, the difficulty
of controlling the dynamic balance proved to be
critical. Long models having shorter launch distances
and better balance are now being fabricated.

The aerodynamic data obtained on these models are
shown in Figures 10, 11, 12 and 13. The drag coefficients
are linear with the mean square yaw for
the cone cylinder flare configuration and up to 15°
for the 10° cone. The pitching moment coefficients
for all of the models are constant over the Reynolds
number range tested (Figures 12 and 13), while the
pitch damping coefficients for all models vary with
Reynolds number. The flat base cone pitch damping
increases with Reynolds number, while the hemispherical
base cone data indicate decreased damping
with Reynolds number, but a large increase in damping
over the flat base cone. The data also indicate
that the cone base configuration may not influence
the pitch damping at higher Reynolds numbers.
These data also agree with data obtained at
7  8 
Figure 12. Reynolds Number Variation of Pitching
and Damping Moments on the 10° Half
Angle Cone, M = 9.2

Figure 13. Reynolds Number Variation of
Pitching and Damping Moments on the
15 Caliber Cone Cylinder Flare
Configuration, M = 9.2


coefficients  decrease  with  Reynolds  number  and  the 
indication  is  that  the  pitch  damping  will  become 
positive  at  slightly  higher  Reynolds  numbers. 

Magnus  and  roll  derivatives  have  also  been  obtained 
but  are  so  far  too  few  to  be  presented  here. 


While analyzing the epicyclic motion of these
models, it became apparent that the epicyclic arm
rates of rotation ranged from 5 to 60 revolutions
per second. These rates are in the same range as
a considerable amount of the turbulence of most
wind tunnels, which indicates the possibility of
turbulence influence on the epicyclic motion and
the computed aerodynamic coefficients. An examination
of the pitching moments and pitch damping
moments of the flights obtained to date indicates
that the tunnel turbulence can influence these coefficients
provided that the arm rates are below 20
cycles. Further tests where the tunnel turbulence
level is lower are planned for the future.


Conclusions 

The analysis of the flights made to date is
very encouraging. The free flight technique permits
the evaluation of damping, Magnus, and roll
derivatives on all types of configurations without
the presence of a sting. Flights in the BRL wind
tunnel have included spinning models with spin
rates up to 40,000 rpm. So far, configurations
have  been  limited  to  10°  half  angle  cones,  1  inch 
base  diameter,  and  15  to  30  caliber  cone  cylinder 
flares,  £  inch  body  diameter.  Low  model  weights 
and  low  moments  of  inertia  have  been  achieved  by 
electroforming  the  model  shells  from  .001  inch 
thick  nickel.  The  model  fabrication  technique 
assures  us  of  being  able  to  electroform  any  model 
shape  for  which  an  electroforming  mandrel  can  be 
made.  The  reduction  procedure  permits  computation 
of  all  of  the  aerodynamic  coefficients  including 
the  damping,  Magnus  and  roll  derivatives. 
SOME ASPECTS OF REGULARIZATION AND APPROXIMATION OF
SOLUTIONS OF ILL-POSED OPERATOR EQUATIONS¹

The University of Wisconsin, Madison


ABSTRACT. In this paper we attempt a classification of the various
approaches that have been proposed for the investigation (solutions and
approximations) of ill-posed problems. The classification is only descriptive;
we do not go into technical details. The author hopes to prepare a
more extensive survey of regularization methods for operator equations
where the technical aspects will be considered. The bibliography, although
extensive, is not intended to be complete.

1. ILL-POSED PROBLEMS: REMARKS AND EXAMPLES. The notion of a well-posed
(correct, properly posed) problem introduced by Hadamard at the
beginning of this century plays an important role in the theory and numerical
approximation of various operator equations arising from problems
of mathematical physics, engineering, and analysis.

Let X and Y be two metric spaces, and let A be an operator on X into
Y. The operator equation Ax = y is said to be well-posed by Hadamard's
definition if the following conditions hold:

(i) there exists a solution of the equation for all y ∈ Y;

(ii)  the  solution  is  unique  in  the  space  X; 

(iii)  the  solution  depends  continuously  on  the  right-hand  side  y. 

If  any  of  these  conditions  are  not  satisfied,  then  the  operator  equation 
is  said  to  be  ill-posed  (incorrect,  improperly  posed). 

The first requirement of well-posedness means that the problem is not
overdetermined, and superfluous conditions are not imposed. The third requirement
of well-posedness is important in problems of mathematical physics
and natural phenomena since the data y are obtained from measurements made
with instruments and are therefore known only approximately. The requirement
guarantees that a small error in y cannot produce a big change in the
solution x.
D-462, while the author was a visiting member of the Mathematics Research
It should be emphasized that the notion of a well-posed problem for
a given operator depends on the spaces considered. Thus the operator
equation Ax = y may be well-posed relative to the spaces (X,Y), while being
ill-posed relative to the spaces (X',Y'), i.e. when A is considered as
a map on X' into Y', so that the data are drawn from Y', and the notion of
continuity is that induced by the metrics on X' and Y'. Mathematically
this is clear since a mapping may be continuous relative to one topology
while failing to be so in another. Physically, this gives a clue as to
the type of measurements that are meaningful, and provides a framework
where small errors in measurements are tolerable.

Hadamard gave a now classical example of a problem (the Cauchy problem for
Laplace's equation) which is ill-posed
in any of the usual function spaces (the space of continuously differentiable
functions, $L^p$ spaces, Sobolev spaces, spaces $H^p$ of analytic functions
on the unit disc, etc.). On the basis of this example, Hadamard concluded
that this problem, and other problems exhibiting a similar dependence of
the solution on the data, do not correspond to any real formulations, i.e.
they are not problems of mathematical physics. In other words, there is
something wrong with the mathematical model and not with the physical
problems which it portrays. It was discovered later, however, that
Hadamard's conclusion was erroneous and that many situations in physics
and in the study of natural phenomena, as well as in some areas of
analysis, lead to problems which are ill-posed. We mention next some
examples of ill-posed problems which appear frequently in the literature.
Some of these were cited by the Soviet Academician A. N. Tikhonov in his
invited address at the International Congress of Mathematicians in
Moscow in 1966.
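
To make the instability in Hadamard's example concrete, it may help to write out the form in which it usually appears in textbooks. The particular normalization of the data below is one common choice and is an assumption here, not a quotation from Hadamard:

```latex
\Delta u = 0 \quad (y > 0), \qquad
u(x,0) = 0, \qquad
\frac{\partial u}{\partial y}(x,0) = \frac{1}{n}\,\sin nx
\qquad\Longrightarrow\qquad
u_n(x,y) = \frac{1}{n^{2}}\,\sin nx\,\sinh ny .
```

As n tends to infinity the Cauchy data tend to zero uniformly, while for any fixed y > 0 the maximum of $|u_n(\cdot,y)|$ equals $\sinh(ny)/n^2$ and grows without bound, so that condition (iii) fails.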

Some  Classes  of  Ill-Posed  Problems  in  Analysis 

1.  Determination  of  a  uniform  approximation  to  the  derivative  u' 
under  approximate  data  in  the  metric  of  the  space  of  continuous  functions. 

2. Determination of the sum of a Fourier series at a given point in
terms of arbitrary values in the space $\ell_2$ for the Fourier coefficients.

3. Uniform approximations of solutions of integral equations of the
first kind and other problems leading to them (analytic continuation,
conformal mapping, operational calculus in the real domain) under perturbation
of the data in the metric of $L_2$ (the space of square-integrable functions).

4.  Numerical  inversion  of  Laplace  transforms. 

5.  The  problem  of  determining  the  input  to  a  system  when  we  know  the 
impulse  response  and  the  output. 

6.  A  wide  range  of  unstable  problems  of  optimization  (unstable 
problems  of  optimal  control  and  filtering,  linear  and  dynamic  programming). 
7. Linear problems on the spectrum under the usual supplementary
conditions determining a unique solution.

8. Ill-conditioned algebraic systems, Fredholm alternative problems
with nonuniqueness, etc.

9. Some classes of "inverse problems". A typical situation here
is that of finding Z, which is not accessible to direct measurement, from a
physically determined manifestation u of Z, where u = AZ and A is a
completely continuous operator. Then the inverse of A is not continuous.

Note  that  there  is  a  considerable  overlap  among  the  classes  mentioned 
above,  and  that  from  the  operator-theoretic  point  of  view  many  of  these 
classes  share  the  same  characteristic. 
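
The first class in the list above (differentiation of approximately known data in the uniform metric) can be illustrated by a short numerical sketch. The perturbation family e_n(x) = sin(n²x)/n is a standard illustrative choice and is an assumption of this example; the function names are hypothetical.

```python
import math


def sup_norm(f, a=0.0, b=1.0, samples=2001):
    """Approximate max |f| on [a, b] by sampling on a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step)) for i in range(samples))


def perturbation(n):
    """Data error e_n(x) = sin(n^2 x) / n; its sup norm is at most 1/n."""
    return lambda x: math.sin(n * n * x) / n


def perturbation_derivative(n):
    """Exact derivative e_n'(x) = n cos(n^2 x); its sup norm is n."""
    return lambda x: n * math.cos(n * n * x)


# As n grows the data error shrinks while the derivative error grows.
results = [(n, sup_norm(perturbation(n)), sup_norm(perturbation_derivative(n)))
           for n in (2, 4, 8)]
for n, data_error, derivative_error in results:
    print(n, data_error, derivative_error)
```

The data u + e_n converge uniformly to u, yet their derivatives move away from u' at rate n, which is the failure of continuous dependence in the metric of continuous functions.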

Some  Areas  of  Real  Phenomena  Which  Lead  to  Ill-Posed  Problems 

(1) Some classical problems of mathematical physics: for example
the Cauchy problem for nonnegative time, the nonhyperbolic Cauchy problem
for the wave equations, etc. (see John [93], Lavrentiev [111], and
Mikhlin [133] for an elucidation of some of these examples). These
include problems from potential theory, hydrodynamics, magnetohydrodynamics, etc.

(2)  Problems  arising  from  geophysics,  atmospheric  studies,  meteorology, 
reservoir  engineering, seismology,  etc.  The  bibliography  contains  numerous 
references  on  such  applications. 

(3)  Problems  of  control  of  a  system  governed  by  partial  differential 
equations  where  the  control  appears  on  the  boundary,  and  boundary-value 
problems  with  overabundant  data  on  one  part  of  the  boundary  and  insufficient 
data  on  the  rest  of  the  boundary. 

2. SOME APPROACHES TO THE INVESTIGATION OF ILL-POSED PROBLEMS. The lack
of a continuous dependence of the solution in an ill-posed problem, or the
lack of uniqueness (or even of the existence of a solution in the classical
sense), makes direct investigation (and particularly approximation) of
ill-posed problems difficult. The "regularization" of such problems has been
cited by Bellman as one of the important concepts of modern analysis.
Intuitively speaking, this means analyzing an ill-posed problem
via an analysis of a well-posed problem, or a sequence of well-posed problems,
provided this analysis gives a suitable approximation to the given problem.
This suggests several approaches which are, generally speaking, based on

(a)  a  change  of  the  concept  of  a  solution; 

(b)  a  change  of  the  spaces  in  question; 

(c)  a  change  of  the  operator  itself; 

(d)  the  concept  of  a  regularizer  or  "regularization  operators"; 

(e)  probability  or  well-posed  stochastic  extensions  of  ill-posed 
problems. 
These intuitive ideas manifest themselves explicitly in several
approaches which are available at present for the investigation and
approximation of ill-posed problems:

(1) The earliest approach, due to Tikhonov [188], is based on the
assumption that there exists a priori information restricting the class of
solutions to a compact set U. In this case the operator equation is
well-posed if the operator A is considered as a map from U onto A(U). An
investigation of the various problems which are amenable to this approach
was carried out by Lavrentiev [111], John [92] and others (see the
bibliography in [111]).

(2) Another approach is based on changing the notion of a "solution"
of a problem, for instance to "quasisolution" (Ivanov [81]; a brief
exposition of this approach is given in the monographs of Holmes [77] and
Lavrentiev [111]), or to the notion of least squares solution of minimal
norm in case the underlying spaces are Hilbert spaces (see for instance
Nashed [144], Kammerer and Nashed [95]). The latter approach lends itself
readily to mathematical programming.

(3)  The  use  of  regularizing  parametric  operators  introduced  by 
Tikhonov  [188],  and  further  explored  by  Bakusinskii  [11],  [12],  Nashed 
and  Wahba  [200],  and  others. 

(4) Another approach closely related to (3) is that of replacing the
operator equation by a stable minimization problem depending on a parameter
(see Bellman, Glicksberg and Gross [22], [23], Phillips [162], Ribiere
[166] and others).

(5) Stochastic and probabilistic approaches involve questions of
measurement of error and disturbances, and provide well-posed extensions
of ill-posed problems (see for instance Franklin [58], Lavrentiev [111]
and Bakusinskii [13]).

(6) The method of quasireversibility of Lattes and Lions [107] is
based on the idea of modifying the differential or integrodifferential
operators arising in boundary-value problems and unstable control and
optimization problems, in order to impart regularity. This approach is
closely connected with some of the preceding approaches.

(7) A new approach to regularization based on the notion of
pseudosolution (least squares solution of minimal norm) in reproducing
kernel Hilbert spaces has been proposed and investigated recently by
Nashed and Wahba [200], [147]. This approach coincides in philosophy
with some of the approaches mentioned above (in the sense that the notion
of a solution is changed and the problem is considered in new spaces),
even though it differs sharply in technical details. In this approach the
geometry of reproducing kernel Hilbert spaces is exploited (in an optimal
way), and the results obtained are the best possible in this context.

Some  of  the  above  approaches  are  easily  carried  out  on  computing 
machines  thereby  providing  effective  methods  for  the  numerical  solution 
of  a  wide  class  of  problems.  This  remark  applies  in  particular  to  the 
approaches  (2),  (3),  (4),  (6),  and  (7). 
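
As a small illustration of how approaches of type (3) and (4) look on a computing machine, the sketch below applies a regularizing parameter to an ill-conditioned 2 x 2 algebraic system. It uses the familiar penalized least-squares form, minimizing ||Ax - y||² + α||x||², which is assumed here as a representative of these approaches rather than taken from any one of the cited papers. The matrix, the data and the value of α are made up for the example.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def tikhonov_2x2(a, y, alpha):
    """Minimize ||a x - y||^2 + alpha ||x||^2 via (a^T a + alpha I) x = a^T y."""
    ata = [[sum(a[k][i] * a[k][j] for k in range(2)) + (alpha if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    aty = [sum(a[k][i] * y[k] for k in range(2)) for i in range(2)]
    return solve_2x2(ata, aty)


def distance(u, v):
    """Euclidean distance between two vectors."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) ** 0.5


a = [[1.0, 1.0], [1.0, 1.0001]]   # nearly singular matrix
y_clean = [2.0, 2.0001]           # exact data; the solution is (1, 1)
y_noisy = [2.0, 2.0002]           # data perturbed by 1e-4 in one component

# Direct inversion: a 1e-4 change in the data moves the solution by about 1.4
naive_shift = distance(solve_2x2(a, y_clean), solve_2x2(a, y_noisy))

# Regularized problem: the same data change moves the solution only slightly
alpha = 1e-3
reg_clean = tikhonov_2x2(a, y_clean, alpha)
reg_shift = distance(reg_clean, tikhonov_2x2(a, y_noisy, alpha))
print(naive_shift, reg_shift, reg_clean)
```

The price of stability is a small bias: the regularized solution for exact data is close to, but not equal to, (1, 1), and the choice of α trades this bias against sensitivity to data errors.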
One of the more important problems that a chemist or chemical
engineer may encounter is the determination of the mechanism and/or
model for a chemical reaction system. This problem may be encompassed
within the broader area of identification [30-32] which
ranges from the one extreme where no a priori information regarding
the system representation is known, to the other extreme where
much is known about the system. The development of a suitable
chemical kinetic representation for the experimental reaction system
lies somewhere between these two extremes. This results from the
fact that there usually exist certain postulated models to represent
experimental kinetic data; however, the determination of the best
constants in these mechanistic models and perhaps the discrimination
among a number of alternative models usually remain as vexing
questions.

We shall adopt the view here that for any given reaction system
there exists a large but finite number of permissible mechanisms,
i.e., mathematical models which, on an a priori basis, could describe
the reaction. While this number may be formidable, the investigator
can usually focus his attention on only those selected models
which he considers appropriate to analyze in detail. The results
of this analysis produce one or more plausible models which are adequate
descriptions of the experimental data. If there is only one
such model, the investigator might, after further consideration, accept
this as the correct model. More often than not, there are several
models which are adequate. These must be subjected to further
study, perhaps along with other models not previously studied.
Through this additional study, including more extensive experimentation
and other independent information, the investigator hopes to
be able to discriminate among alternative models to find the correct
model, assuming one exists.

The difficulties of this analysis are numerous. They rest, for example,
on the proper design of the experiments used to generate
the data. Box and co-workers [8-14] and Blakemore and Hoerl [5]
have emphasized the need for appropriately designed experiments
since the damage of poor design is irreparable and may negate the
subsequent analysis, no matter how ingenious this analysis. Assuming
several models satisfy all the features of the analysis, the selection
of an adequate model can be made only tentatively. Further experimentation
must be conducted or independent information brought
to bear to attempt to discriminate among competitive models and
thus to determine that model which best fits the data.

In this context, one can recognize the process of kinetic investigation
as an iterative one in which experimentation and a proposed
model lead to data analysis, which in turn leads to further experimentation.
This combined approach continues in a presumably converging
cycle between analysis and experimentation toward a choice
of the most adequate model. Since the investigation cannot be exhaustive
in examining all possibilities, no proof exists that the correct
model has been found. The investigator must be satisfied that he has
determined only the most adequate representation of the experimental
kinetic data.

It is apparent that the advent of the modern digital computer has
greatly enlarged the scope of methods for such data analysis and
model construction and discrimination. In the present article we
shall point out how the high-speed capabilities of the computer can
be used as an integral part of the overall cyclic procedure mentioned
above. In other words we shall show how the computer may first be
used to construct the kinetic models (nonlinear parameter estimation)
which represent in a statistical sense the experimental kinetic data.
Application to both homogeneous and heterogeneous kinetic systems
will be discussed, as well as a variety of alternative statistical formulations.
It will then be shown how the computer can be used to specify
the experimental conditions for obtaining further kinetic data so
as to aid the discrimination among the alternative models or to improve
the accuracy of a single model.


I.  KINETIC  AND  MODEL  DEFINITIONS 

To characterize the models which we shall consider in this article,
we shall first specify the types of variables and then the forms
of equations relating the variables, and we shall finally show how the
kinetic equations are encompassed within this formulation. Thus we
define the following types of variables:

1. Parameters. These are constants within a model whose numerical
values are unknown. The (column) vector of p parameters is denoted
by $\theta = \{\theta_1, \theta_2, \ldots, \theta_p\}$.

2. Independent Variables. These are variables which are either
fixed arbitrarily for each experiment or which are known precisely
for each experimental observation. Typical independent variables
in a kinetics experiment might be the reaction time or the space
velocity. The (column) vector of k independent variables for the $\mu$th
experiment is denoted by $x_\mu = \{x_{\mu 1}, x_{\mu 2}, \ldots, x_{\mu k}\}$.

3. Dependent Variables. These are the variables which the model
will predict on the basis of known values of the parameters and independent
variables. Thus, given a specified set of parameter values
and reaction time, the model will predict a concentration of a component
in the kinetic model. The (column) vector of n dependent
variables is denoted by $\eta = \{\eta_1, \eta_2, \ldots, \eta_n\}$.

4. Observed Variables. Those dependent variables which are actually
measured in an experiment are called the observed variables.
The (column) vector of r observed variables for the $\mu$th experiment
is denoted by $y_\mu = \{y_{\mu 1}, y_{\mu 2}, \ldots, y_{\mu r}\}$.

An  experiment  consists  of  the  measurement  of  all  observed  vari¬ 
ables  for  a  given  set  of  values  of  the  independent  variables. 

The  models  which  we  shall  consider  relate  the  dependent  variables 
to  the  independent  variables  and  the  parameters.  In  concise  form  we 
may  write  the  model  as 

$$q(\eta_\mu, x_\mu, \theta) = 0, \qquad \mu = 1, 2, \ldots, m \tag{1}$$

where q is a vector of functions of dimension equal to that of the dependent
variables and m is the number of experiments. An alternative
formulation is to write the reduced form

$$\eta_\mu = f(x_\mu, \theta), \qquad \mu = 1, 2, \ldots, m \tag{2}$$

While  Eqs.  (1)  and  (2)  are  equivalent  mathematically,  care  must  be 
used  in  terms  of  computer  algorithms  since  parameter  estimates 
may  not  be  invariant  under  the  transformation  from  one  form  to  the 
other. 
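
One simple way to see how estimates can change with the form in which a model is written is to fit a first-order decay both in its reduced form, [N]/[N0] = exp(-kt), and in a logarithmically transformed form, ln([N]/[N0]) + kt = 0. The logarithmic transformation is used here only as an illustrative stand-in for the passage between forms such as Eqs. (1) and (2); it is an assumption of this sketch. The data are made up, and the helper names are hypothetical.

```python
import math


def golden_section_minimize(func, lo, hi, tol=1e-10):
    """Minimize a unimodal scalar function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


# Made-up noisy observations of [N]/[N0] (illustrative only)
times = [1.0, 2.0, 3.0, 4.0]
observed = [0.62, 0.35, 0.25, 0.12]


def sum_sq_direct(k):
    """Sum of squared errors in the reduced form [N]/[N0] = exp(-k t)."""
    return sum((y - math.exp(-k * t)) ** 2 for t, y in zip(times, observed))


# Transformed form ln(y) + k t = 0: linear least squares has a closed form
k_log = (-sum(t * math.log(y) for t, y in zip(times, observed))
         / sum(t * t for t in times))

# Reduced form: one-dimensional nonlinear least squares
k_direct = golden_section_minimize(sum_sq_direct, 0.01, 2.0)
print(k_log, k_direct)
```

Both estimates are reasonable, but they are not equal, because the two forms weight the same measurement errors differently.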

Let  us  now  show  how  kinetic  rate  equations  fit  into  the  formulation 
of  Eqs.  (1)  or  (2).  Suppose  we  have  a  unidirectional,  unimolecular 
homogeneous  reaction  of  the  type 


For isothermal conditions the rate equation for this reaction is

$$\frac{d[N]}{dt} = -k[N] \tag{3}$$

where [N] is the concentration or partial pressure, t the reaction
time, and k the reaction velocity constant. Integrating this equation
with a given initial [N] at time t = 0, $[N_0]$, we have

$$[N] = [N_0] \exp(-kt) \tag{4}$$

If we now relate $\eta = \{[N]\}$, $x = \{t, [N_0]\}$, and $\theta = \{k\}$ (the subscript $\mu$
has been dropped for convenience), we see that Eq. (4) is equivalent
to Eq. (2). Now consider the isothermal rate equations corresponding
to the reaction
$$A \xrightarrow{k_1} B \xrightarrow{k_2} C$$

They are

$$\frac{d[A]}{dt} = -k_1[A]$$

$$\frac{d[B]}{dt} = k_1[A] - k_2[B]$$

$$\frac{d[C]}{dt} = k_2[B]$$

When initial conditions are established such as

$$[A_0] = 1, \quad [B_0] = [C_0] = 0, \qquad t = 0$$

the corresponding integrated forms are given by

$$[A] = \exp\{-k_1 t\}$$

$$[B] = \frac{k_1}{k_2 - k_1}\left(\exp\{-k_1 t\} - \exp\{-k_2 t\}\right)$$

$$[C] = 1 - \frac{1}{k_2 - k_1}\left(k_2 \exp\{-k_1 t\} - k_1 \exp\{-k_2 t\}\right)$$

Now defining the vectors $\eta = \{[A], [B], [C]\}$, $x = \{t\}$, and $\theta = \{k_1, k_2\}$,
we see that we have once again the form of Eq. (2).

A feature of this comparison which deserves some further consideration
is that in both cases the rate equations can be integrated
analytically. Under this requirement we may ascertain the resulting
form without any difficulty. But what if we introduce a temperature-dependent
velocity constant or nonunimolecular reactions? The probability
of an analytical integration decreases to zero rapidly. However,
since we shall always consider that we have a digital computer
available, this need not distress us greatly. We can always integrate
the set of rate equations numerically to yield values of $\eta$ at any selected
x. Stated in another way, once we have the rate equations it is
in a sense immaterial whether we integrate these rate equations analytically
or numerically. The end result of either path is a functional
equation of the form of Eq. (2).
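
The equivalence of the two paths can be checked on the consecutive first-order reactions A, B, C treated above. The sketch below integrates the three rate equations with a classical fourth-order Runge-Kutta scheme and compares the result with the integrated forms. The choice of integrator, the step count and the values k1 = 1.0 and k2 = 0.5 are assumptions made for the illustration.

```python
import math


def rates(state, k1, k2):
    """Right-hand sides d[A]/dt, d[B]/dt, d[C]/dt for A -> B -> C."""
    a, b, _ = state
    return [-k1 * a, k1 * a - k2 * b, k2 * b]


def rk4_integrate(state, k1, k2, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        s1 = rates(state, k1, k2)
        s2 = rates([x + 0.5 * h * d for x, d in zip(state, s1)], k1, k2)
        s3 = rates([x + 0.5 * h * d for x, d in zip(state, s2)], k1, k2)
        s4 = rates([x + h * d for x, d in zip(state, s3)], k1, k2)
        state = [x + h / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
                 for x, d1, d2, d3, d4 in zip(state, s1, s2, s3, s4)]
    return state


def analytic(t, k1, k2):
    """Integrated forms for [A0] = 1, [B0] = [C0] = 0 (requires k1 != k2)."""
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    a = e1
    b = k1 / (k2 - k1) * (e1 - e2)
    c = 1.0 - (k2 * e1 - k1 * e2) / (k2 - k1)
    return [a, b, c]


numeric = rk4_integrate([1.0, 0.0, 0.0], 1.0, 0.5, 2.0, 200)
exact = analytic(2.0, 1.0, 0.5)
max_gap = max(abs(n - e) for n, e in zip(numeric, exact))
print(numeric, exact, max_gap)
```

For a parameter estimation routine either function plays the role of f(x, θ) in Eq. (2); only the numerical version remains available once the rate equations cannot be integrated in closed form.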

Let us now briefly turn to a heterogeneous-type kinetic system.
As an example, the following bimolecular reversible reaction in the
gas-solid phase is considered

$$A + R \underset{k_2}{\overset{k_1}{\rightleftharpoons}} B + C$$

If  the  controlling  mechanism  is  that  of  surface  reaction  on  dual  sites 
without  dissociation,  the  Langmuir-Hinshelwood  expression  becomes 
$$\frac{d[A]}{d(W/F_A)} = r_A = -\frac{k_1[A][R] - k_2[B][C]}{\left(1 + K_A[A] + K_R[R] + K_B[B] + K_C[C]\right)^2}$$


where the dependent variables denoted by concentrations are normally
expressed as partial pressures, the independent variable is reciprocal
space velocity, $W/F_A$, and the parameters, in addition to those
previously encountered, include the adsorption equilibrium coefficients
$K_A$, $K_R$, $K_B$, and $K_C$.

As compared with homogeneous noncatalytic reactions, it is apparent
that $W/F_A$ assumes the same role as t, that the numerator in the
rate expression is identical, and that the only substantial difference
lies in the introduction of the denominator term. Thus, in functional
notation, if we define $\eta = \{[A], [R], [B], [C]\}$, $x = \{W/F_A, [A_0], [R_0], [B_0], [C_0]\}$,
and $\theta = \{k_1, k_2, K_A, K_R, K_B, K_C\}$, we have the equivalent of Eq. (2).

Let us close out this section with one further example to illustrate
the models which can be encompassed within the form of Eq. (2). Consider
that we have three components in a homogeneous reaction and
that it is postulated that the rate equations are


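$$\frac{d\eta_1}{dt} = -k_1\eta_1\eta_2 + k_2\eta_3 - k_3\eta_1$$

$$\frac{d\eta_2}{dt} = -k_1\eta_1\eta_2 + k_2\eta_3 + k_3\eta_1 \tag{5}$$

$$\frac{d\eta_3}{dt} = k_1\eta_1\eta_2 - k_2\eta_3$$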


where $\eta_1$, $\eta_2$, and $\eta_3$ correspond to, e.g., [A], [B], and [C] used previously.
In addition we have a set of unknown initial concentrations:

$$\eta_1 = a, \quad \eta_2 = b, \quad \eta_3 = 1 - a - b \qquad \text{at } t = 0 \tag{6}$$

At time $t_\mu$ (representing the $\mu$th experiment) we withdraw three samples
from the reaction mixture and perform the following tests:

a. We determine $\eta_{\mu 1} = \eta_1(t_\mu)$ directly by titration.

b. A different person independently determines $\eta_{\mu 1}$ by titration.

c. We determine the light absorptivity of the solution, assuming
this to be a linear function with unknown coefficients of the concentrations.
Denoting the results of these three measurements as $y_{\mu 1}$, $y_{\mu 2}$,
and $y_{\mu 3}$, we have

$$y_{\mu 1} = \eta_{\mu 1}$$

$$y_{\mu 2} = \eta_{\mu 1} \tag{7}$$

$$y_{\mu 3} = \beta_0 + \beta_1\eta_{\mu 1} + \beta_2\eta_{\mu 2} + \beta_3\eta_{\mu 3}$$

Equations (5)-(7) together constitute the model; functionally these
are related by the equation

$$y_\mu = h(\eta_\mu, \theta) = f(x_\mu, \theta) \tag{8}$$

where $\theta = \{a, b, k_1, k_2, k_3, \beta_0, \beta_1, \beta_2, \beta_3\}$ and $x_\mu = \{t_\mu\}$. Equation (8) has
a slightly different form than Eq. (2) because we have written the
left-hand side in terms of the observed variables and not the model
dependent variables. This is of no importance, since dependent variables
which are not observed play no role in the parameter estimation
procedure.

Thus  we  see  that  the  kinetic  models  under  consideration  in  this 
work  may  be  written  in  the  functional  form  of  Eq.  (1)  or  (2)  [or  (8)] 
and  that  all  kinetic  rate  equations  can,  in  one  form  or  another,  be 
encompassed  within  this  formulation. 
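
As an illustration of how such a model becomes a computable function $f(x_\mu, \theta)$, the following minimal Python sketch integrates the three rate equations and returns the three observed quantities. It assumes the rate equations read $d\eta_1/dt = -k_1\eta_1\eta_2 + k_2\eta_3 - k_3\eta_1$, $d\eta_2/dt = -k_1\eta_1\eta_2 + k_2\eta_3 + k_3\eta_1$, $d\eta_3/dt = k_1\eta_1\eta_2 - k_2\eta_3$; the function names, the fixed-step Runge-Kutta integrator, and the step count are illustrative choices, not part of the original work.

```python
def rates(eta, k):
    """Right-hand side of the assumed rate equations."""
    e1, e2, e3 = eta
    k1, k2, k3 = k
    return (
        -k1 * e1 * e2 + k2 * e3 - k3 * e1,
        -k1 * e1 * e2 + k2 * e3 + k3 * e1,
        k1 * e1 * e2 - k2 * e3,
    )


def integrate(eta0, k, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    eta = tuple(eta0)
    if t_end == 0:
        return eta
    h = t_end / steps
    for _ in range(steps):
        d1 = rates(eta, k)
        d2 = rates(tuple(e + 0.5 * h * d for e, d in zip(eta, d1)), k)
        d3 = rates(tuple(e + 0.5 * h * d for e, d in zip(eta, d2)), k)
        d4 = rates(tuple(e + h * d for e, d in zip(eta, d3)), k)
        eta = tuple(
            e + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for e, a, b, c, d in zip(eta, d1, d2, d3, d4)
        )
    return eta


def predict(t, theta):
    """Predicted observations (two titrations of eta_1, one absorptivity)."""
    a, b, k1, k2, k3, beta0, beta1, beta2, beta3 = theta
    e1, e2, e3 = integrate((a, b, 1.0 - a - b), (k1, k2, k3), t)
    return (e1, e1, beta0 + beta1 * e1 + beta2 * e2 + beta3 * e3)
```

A parameter estimation routine would call `predict(t, theta)` once per experiment and compare the result with the three measured values.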


II.  PARAMETER  ESTIMATION 

Having established the form of our predictive model and assuming
that experimental data are available, the next task is to outline
efficient methods for determining the parameters $\theta$ such that the
model fits the data. Due to errors in the measurements and
inaccuracies in the model, it is impossible to hope for an exact
fit. Instead we shall try to find values of $\theta$ which minimize some
appropriate measure of the errors involved. This is parameter
estimation. Such a minimization may be carried out in a number of
ways, but here we wish to point out the prodigious increase in our
ability to do this estimation via computer analysis.

To illustrate the available procedures, let us rewrite our model
equation in the form

$$y_\mu = f(x_\mu, \theta) + \epsilon_\mu \tag{9}$$

where $\epsilon_\mu$ is the vector of errors, or residuals, between the
observations and the predicted dependent variables. Parameter estimation
then tries to find a set of parameters $\theta$ such that some scalar
function of the errors is minimized. In general we shall write this
function as $S(\theta)$ to indicate the dependence on the chosen parameters.

A.  Least-Squares  Minimization 


In the least-squares approach to this minimization the function
$S(\theta)$ is defined by the sum of squares of the errors, or

$$S_{LS}(\theta) = \sum_{\mu=1}^{m} \epsilon_\mu^2 = \sum_{\mu=1}^{m} [y_\mu - f(x_\mu, \theta)]^2 \tag{10}$$

One seeks those $\theta$ which minimize $S_{LS}(\theta)$; the resulting "best"
parameters are indicated by $\hat{\theta}$.

It is possible to recognize two cases associated with Eq. (10). In
the first case the parameters $\theta$ enter $f(x_\mu, \theta)$ in a linear fashion
(the result of the minimization is then referred to as linear least
squares); in the second case no such linearity occurs and we have
nonlinear least squares.

1.  Linear  Least  Squares 


Linear least squares is also known as multiple linear regression.
The approach generally treated in most kinetic books deals with the
very special case where the parameters enter the model equations in
a linear fashion. In such a case the model need only be substituted
into Eq. (10), the derivative of $S(\theta)$ taken with respect to $\theta$, and the
result set equal to zero. This yields $p$ linear algebraic equations
corresponding to the $p$ parameters (called the normal equations). By
means of a single matrix inversion it is then possible to solve for
the best $\hat{\theta}$.

If the $\epsilon_\mu$ are independently and identically distributed errors with
zero means, then by this procedure the $\hat{\theta}$ are efficient linear unbiased
estimates of $\theta$. In other words the parameter estimates which
minimize the sum of squares of the residuals will, on the average, equal
the true values of the parameters (unbiased) and will be estimated
with maximum precision (minimum variance).
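
The normal equations can be made concrete with a small sketch. The Python function below fits the two-parameter linear model $y = \theta_1 + \theta_2 x$ by forming and solving the two normal equations directly; the model, the function name, and the data in the usage note are illustrative assumptions, not taken from the original work.

```python
def fit_line(x, y):
    """Least-squares estimates (theta1, theta2) for y = theta1 + theta2 * x."""
    m = len(x)
    sx = sum(x)
    sy = sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Normal equations:  [m   sx ] [theta1]   [sy ]
    #                    [sx  sxx] [theta2] = [sxy]
    det = m * sxx - sx * sx
    if det == 0:
        raise ValueError("normal equations are singular")
    theta1 = (sxx * sy - sx * sxy) / det
    theta2 = (m * sxy - sx * sy) / det
    return theta1, theta2
```

For the exact data `x = [0, 1, 2, 3]`, `y = [1, 3, 5, 7]` the function returns `(1.0, 2.0)`. With $p$ parameters the same idea leads to a $p \times p$ linear system.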

Unfortunately this method has a number of decided defects.
Foremost among these is the fact that few kinetic models of any
complexity occur in the desired linear form. As a result the model
equations must be linearized or rearranged to be handled by this method.
Thus for the familiar expression [see Eq. (4)]

$$\eta = \exp\{-kt\}$$

a conventional logarithmic transformation leads to

$$\eta' = \log \eta = -kt$$

which is linear in the parameter $k$ and is therefore amenable to
treatment by linear least squares. However, if the corresponding
error equation for the original expression is given by

$$y_\mu = \exp\{-k t_\mu\} + \epsilon_\mu$$

it is apparent that the logarithmic transformation does not preserve
the original distribution of errors which might have been judged
appropriate.
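
The effect of the transformation can be seen numerically. The sketch below, written under illustrative assumptions (made-up data with fixed additive perturbations, a golden-section search standing in for a general minimizer, hypothetical function names), estimates $k$ once from the linearized form $\log y = -kt$ and once by minimizing the sum of squares of the untransformed residuals.

```python
import math


def sum_sq(k, t, y):
    """Sum of squared residuals of the untransformed model y = exp(-k t)."""
    return sum((yi - math.exp(-k * ti)) ** 2 for ti, yi in zip(t, y))


def k_from_log_fit(t, y):
    """Linear least squares on log y = -k t (a line through the origin)."""
    return -sum(ti * math.log(yi) for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)


def k_from_direct_fit(t, y, lo=0.0, hi=5.0, iterations=100):
    """Golden-section search for the k that minimizes sum_sq on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if sum_sq(left, t, y) < sum_sq(right, t, y):
            hi = right  # the minimum lies in [lo, right]
        else:
            lo = left   # the minimum lies in [left, hi]
    return 0.5 * (lo + hi)


# Illustrative data: true k = 0.5 with fixed additive errors.
t = [0.5, 1.0, 2.0, 3.0, 4.0]
shifts = [0.02, -0.02, 0.03, -0.03, 0.04]
y = [math.exp(-0.5 * ti) + d for ti, d in zip(t, shifts)]

k_log = k_from_log_fit(t, y)
k_direct = k_from_direct_fit(t, y)
```

The two estimates differ because the logarithm gives the small late-time observations, whose relative errors are largest, a disproportionate influence; only `k_direct` minimizes the sum of squares of the errors as originally specified.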

This situation is further aggravated when a temperature-dependency
condition is introduced through the Arrhenius relation. Then a
two-step estimation process with logarithmic transformation for
each step is required.

From $\log \eta = -kt$, estimate $k$.

From $\log k = \ln k_0 - (E/R)(1/T)$, estimate $k_0$ and $E$.

A similar difficulty ensues when heterogeneous catalytic reactions
represented by the Langmuir-Hinshelwood relations are studied by
linear least squares. Thus suppose the model of a heterogeneous
reaction is taken as

$$r_A = \frac{k'(p_A - p_R p_S/K_{equil})}{(1 + K_A p_A + K_R p_R + K_S p_S)^2} \tag{11}$$

where the $p_A$, ..., are partial pressures and $K_{equil}$ is a known
constant. Obviously the parameters $k'$, $K_A$, $K_R$, and $K_S$ do not appear
linearly in the equation. If, however, the equation is rearranged to
the form

$$\left[\frac{p_A - p_R p_S/K_{equil}}{r_A}\right]^{1/2} = \frac{1}{\sqrt{k'}} + \frac{K_A p_A}{\sqrt{k'}} + \frac{K_R p_R}{\sqrt{k'}} + \frac{K_S p_S}{\sqrt{k'}} \tag{12}$$

or

$$\eta_A = \theta_1 + \theta_2 p_A + \theta_3 p_R + \theta_4 p_S$$

then combinations of parameters occur linearly. However, now the
partial pressures appear on both sides of the equation and therefore
play the role of both dependent and independent variables.
Furthermore, the errors of Eq. (12) are minimized and not the errors of
Eq. (11).

As  a  result  of  such  transformations  the  method  of  linear  least 
squares  will  give  parameter  estimates  which  guarantee  neither  a 
best  prediction  of  reaction  rates  [the  left  side  of  Eq.  (11)]  nor  rea¬ 
sonably  satisfactory  extrapolations.  Furthermore,  in  models  involv¬ 
ing  several  competing  reactions,  linearization  is  usually  quite  im¬ 
possible. 

In  summary,  we  see  that  linear  least  squares  is  applicable  for 
parameter  estimation  in  only  very  special  cases  and  that  it  cannot 
be  prescribed  as  a  viable  method  except  under  special  circumstances. 

2.  Nonlinear  Least  Squares 

Since  the  linear  least-squares  approach  has  these  obvious  defects, 
it  seems  natural  to  turn  to  a  method  which  does  not  require  that  the 
model  equations  be  linear  in  the  unknown  parameters.  We  refer  to 
this  method  as  nonlinear  least  squares  since  the  parameters  may  oc¬ 
cur  in  any  fashion,  linear  or  nonlinear. 

In a broad-gauged description of this method, one attempts to
minimize $S(\theta)$ in an iterative fashion rather than in a single step. As such,
and  because  of  the  extensive  calculations  which  are  then  required,  the 
method  almost  certainly  requires  implementation  via  a  computer. 
Methods  for  performing  this  minimization  are  described  in  a  subse¬ 
quent  section. 

It  is  apparent  that  nonlinear  least  squares,  in  which  the  depen¬ 
dency  of  the  dependent  variables  on  the  parameters  is  essentially  unre¬ 
stricted,  obviates  many  of  the  difficulties  inherent  in  linear  least 
squares.  The  kinetic  rate  equations  are  used  in  the  form  originally 
proposed without imposing arbitrary transformations; the
minimization of the sum of squares of errors is appropriate in terms of
untransformed variables. Furthermore, the functional rate models may
be formulated explicitly or implicitly, either as integrated or
differential rate equations.

Nonlinear least squares, however, does not overcome all problems
inherent in parameter estimation. It is not valid when several
variables are observed at each experiment; it does not make sense to add
together sums of squares in Eq. (10) of, e.g., pressures and
temperatures. This problem may be overcome by assigning a weight factor
to each variable and minimizing the weighted sum of squares, i.e.,
instead of Eq. (10) use

$$S_{WLS}(\theta) = \sum_{\mu=1}^{m} \sum_{i=1}^{r} w_i [y_{\mu i} - f_i(x_\mu, \theta)]^2 \tag{13}$$

Unfortunately one usually does not know what weights to assign, i.e.,
what numerical values of $w_i$ to use. On theoretical grounds these
weights should be inverses of the variances of the measurement
errors, assuming these are known. This problem can be overcome by
the maximum likelihood approach to be described below, which
enables one to estimate not only the parameters but the weights as
well.

B.  Maximum  Likelihood 


The  maximum  likelihood  principle  offers  a  powerful  and  versatile 
tool  to  the  parameter  estimator.  A  price  must  be  paid,  however,  in 
that  explicit  assumptions  concerning  the  form  of  the  probability  dis¬ 
tribution  of  the  errors  must  be  made.  Once  the  form  has  been  as¬ 
sumed,  any  parameters  appearing  in  the  distribution  function  may  be 
estimated  along  with  the  unknown  parameters  in  the  model.  This  is 
really  as  it  should  be;  the  errors  in  the  observations  are  as  much  a 
part  of  physical  reality  as  are  the  reactions  being  observed.  Mean¬ 
ingful  parameter  estimation  can  proceed  only  if  we  provide  a  mathe¬ 
matical  model  not  only  of  the  reactions,  but  also  of  the  errors,  and 
a  probability  distribution  is  an  appropriate  mathematical  model  for 
the  errors. 

Suppose we assume $p(\epsilon, \psi)$ to be the joint probability density
function of all the errors $\epsilon$ in the observed variables. Here $\psi$ represents
a set of unknown parameters (e.g., means, variances) which appear
in the formulas for the distribution. Given any values for the model
parameters $\theta$, we substitute the residuals $y - f(x, \theta)$ for the errors $\epsilon$
in the expression for the probability density. This yields a function
depending on $\theta$ and $\psi$ (the $x$ and $y$ being given by the observations):

$$L(\theta, \psi) = p[y - f(x, \theta), \psi]$$

We refer to $L(\theta, \psi)$ as the likelihood function. The maximum
likelihood method simply consists of finding those values of $\theta$ and $\psi$ which
maximize $L$. For reasons of convenience one usually maximizes
$S_{ML} = \log L$. This is clearly equivalent to maximizing $L$. Thus, we
have

$$S_{ML}(\theta, \psi) = \log p[y - f(x, \theta), \psi] \tag{14}$$

If, as is frequently assumed, the observation errors in different
experiments are independent, the joint probability density $p$ is the
product of the individual experiment probability densities $p_\mu$, and

$$S_{ML}(\theta, \psi) = \sum_{\mu=1}^{m} \log p_\mu[y_\mu - f(x_\mu, \theta), \psi] \tag{15}$$

The most frequently used distribution is the normal distribution.
If the errors of each experiment are normally distributed with zero
means and covariance matrix $V_\mu$, then

$$p_\mu(\epsilon_\mu) = (2\pi)^{-r/2} (\det V_\mu)^{-1/2} \exp\left(-\tfrac{1}{2} \epsilon_\mu^T V_\mu^{-1} \epsilon_\mu\right) \tag{16}$$

where $\det V$ means the determinant of the matrix $V$ and the
superscript $T$ means the transpose of $\epsilon$. Substituting Eq. (16) in Eq. (15),
we obtain

$$S_{ML}(\theta, V_\mu) = -\frac{rm}{2} \log 2\pi - \frac{1}{2} \sum_{\mu=1}^{m} \log(\det V_\mu) - \frac{1}{2} \sum_{\mu=1}^{m} \epsilon_\mu^T V_\mu^{-1} \epsilon_\mu \tag{17}$$

where we have written $\epsilon_\mu$ for $y_\mu - f_\mu = y_\mu - f(x_\mu, \theta)$; $r$ is the
number of observed variables and $m$ the number of experiments.

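We shall now apply the maximum likelihood principle to some
special cases of Eq. (17):

a. Suppose all the matrices $V_\mu$ are known. The only nonconstant
term in Eq. (17) is

$$\sum_{\mu=1}^{m} \epsilon_\mu^T V_\mu^{-1} \epsilon_\mu$$

and therefore maximizing Eq. (17) is equivalent to minimizing:

$$S(\theta) = \sum_{\mu=1}^{m} \epsilon_\mu^T V_\mu^{-1} \epsilon_\mu = \sum_{\mu=1}^{m} \sum_{i=1}^{r} \sum_{j=1}^{r} [y_{\mu i} - f_i(x_\mu, \theta)][y_{\mu j} - f_j(x_\mu, \theta)] V_\mu^{ij} \tag{18}$$

where $V_\mu^{ij}$ is the $i,j$ element of the matrix $V_\mu^{-1}$. This is the most
general form of weighted least squares.

b. Suppose the covariance matrices for all experiments are
identical, i.e., $V_1 = V_2 = \ldots = V$. Then Eq. (17) becomes

$$S_{ML}(\theta, V) = -\frac{rm}{2} \log 2\pi - \frac{m}{2} \log(\det V) - \frac{1}{2} \sum_{\mu=1}^{m} \epsilon_\mu^T V^{-1} \epsilon_\mu \tag{19}$$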

Consider first the case where all errors are independent. This is
equivalent to saying that the matrix $V$ is diagonal. Let $v_i = V_{ii}$ be
the $i$th diagonal element of $V$. Then Eq. (19) becomes

$$S_{ML}(\theta, V) = -\frac{rm}{2} \log 2\pi - \frac{m}{2} \sum_{i=1}^{r} \log v_i - \frac{1}{2} \sum_{\mu=1}^{m} \sum_{i=1}^{r} \epsilon_{\mu i}^2 v_i^{-1} \tag{20}$$

If the $v_i$ are known, we minimize

$$\sum_{\mu=1}^{m} \sum_{i=1}^{r} \epsilon_{\mu i}^2 v_i^{-1}$$

the usual form of weighted least squares. If the $v_i$ are unknown, we
maximize Eq. (20) first with respect to the $v_i$ by setting
$\partial S/\partial v_i = 0$. Thus

$$-\frac{m}{2 v_i} + \frac{1}{2 v_i^2} \sum_{\mu=1}^{m} \epsilon_{\mu i}^2 = 0$$

Solving for $v_i$ we find

$$v_i = \frac{1}{m} \sum_{\mu=1}^{m} \epsilon_{\mu i}^2 \tag{21}$$

and substituting Eq. (21) into Eq. (20),

$$S_{ML}(\theta) = -\frac{rm}{2} \log 2\pi - \frac{m}{2} \sum_{i=1}^{r} \log\left(\frac{1}{m} \sum_{\mu=1}^{m} \epsilon_{\mu i}^2\right) - \frac{rm}{2} = -\frac{rm}{2}\left(\log\frac{2\pi}{m} + 1\right) - \frac{m}{2} \sum_{i=1}^{r} \log \sum_{\mu=1}^{m} \epsilon_{\mu i}^2 \tag{22}$$

Maximizing Eq. (22) is equivalent to minimizing

$$S(\theta) = \sum_{i=1}^{r} \log \sum_{\mu=1}^{m} \epsilon_{\mu i}^2 \tag{23}$$

This may be regarded as solving a weighted least-squares problem
with unknown weights. Once the maximizing values of $\theta$ have been
found, the unknown weights can be estimated from Eq. (21).

c. Analogously, it can be shown [24] that if $V$ is an unknown,
nondiagonal matrix, maximizing Eq. (19) is equivalent to minimizing

$$S(\theta) = \log(\det M) \tag{24}$$

where $M$ is the moment matrix of the residuals

$$M_{ij} = \sum_{\mu=1}^{m} \epsilon_{\mu i} \epsilon_{\mu j}$$

and $V$ is estimated by

$$\hat{V} = \frac{1}{m} M \tag{25}$$

If the observed variables are linearly dependent (as concentrations
are related in material balances), the matrix $M$ will be nearly
singular and minimization of Eq. (24) will be difficult, if not impossible.
It is thus recommended that Eq. (23) be used instead of Eq. (24) in
such cases. Alternatively a linearly independent subset of the
observed variables may be chosen, and "pseudo-observations" for
these may be calculated by linear regression from the totality of
observed variables for each experiment. These pseudo-observations
may then be used safely in Eq. (24).
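
The objective functions for unknown variances are simple to evaluate once the residuals are available. The following Python sketch assumes the residuals are stored as a list of rows, one row per experiment and one column per observed variable; all function names are hypothetical. It computes the log-sum-of-squares objective for a diagonal $V$ together with the variance estimates $v_i = (1/m)\sum_\mu \epsilon_{\mu i}^2$, and the log-determinant objective for a full $V$ built on the moment matrix $M_{ij} = \sum_\mu \epsilon_{\mu i}\epsilon_{\mu j}$.

```python
import math


def column_sums_of_squares(residuals):
    """Sum of squared residuals for each observed variable (each column)."""
    r = len(residuals[0])
    return [sum(row[i] ** 2 for row in residuals) for i in range(r)]


def diagonal_objective(residuals):
    """Sum over variables of log(sum over experiments of squared residual)."""
    return sum(math.log(s) for s in column_sums_of_squares(residuals))


def estimated_variances(residuals):
    """Variance estimate for each variable: column sum of squares over m."""
    m = len(residuals)
    return [s / m for s in column_sums_of_squares(residuals)]


def moment_matrix(residuals):
    """M[i][j] = sum over experiments of residual_i * residual_j."""
    r = len(residuals[0])
    return [
        [sum(row[i] * row[j] for row in residuals) for j in range(r)]
        for i in range(r)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
    return det


def full_objective(residuals):
    """log(det M); fails when the observed variables are linearly dependent."""
    det = determinant(moment_matrix(residuals))
    if det <= 0.0:
        raise ValueError("moment matrix is singular")
    return math.log(det)
```

When one column of residuals is an exact multiple of another, `determinant(moment_matrix(...))` is zero and `full_objective` raises, which is the singularity described above; `diagonal_objective` remains usable in that situation.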

C.  Bayesian  Estimation 


It occurs frequently that even before we start estimating the
parameters from current data, some information concerning the values of
the parameters is available from previous experiments, or from
general physical considerations. It is frequently possible to summarize
this a priori information in the form of a relative probability density
function $p_0(\theta)$. If, e.g., $p_0(\theta_1)/p_0(\theta_2) = 10$, this is interpreted to mean
that we think $\theta = \theta_1$ to be 10 times more likely than $\theta = \theta_2$. We refer
to $p_0(\theta)$ as the prior distribution. The following are typical examples:

a. A reaction rate constant must be positive. Hence $p_0(k) = 0$ for
$k \le 0$. If we have no further information on $k$, we set $p_0(k) = c$ (a
positive constant whose value is immaterial) for $k > 0$. Computational
convenience often requires that an upper bound $a$ be prescribed for $k$.
In this case $p_0(k) = 0$ also for $k > a$.

b. The ratio $K$ of forward and reverse reaction rate constants may
have been estimated as being $K_0 \pm \sigma$ from measurements of equilibrium
concentrations. It is then reasonable to use the normal prior
distribution

$$p_0(K) \propto \exp\left[-\frac{(K - K_0)^2}{2\sigma^2}\right]$$

When we come to estimate the parameters on the basis of the new
data, we take the prior information into account by multiplying the
likelihood function by the prior distribution. In terms of the
logarithms, we maximize

$$S = \log p[y - f(x, \theta), \psi] + \log p_0(\theta) \tag{26}$$

In case (a) above, $\log p_0$ is negatively infinite outside the
prescribed bounds. Maximizing Eq. (26) then consists of finding the
maximum of $\log p$ subject to the constraints, e.g., $0 \le k \le a$.

Sequential estimation, i.e., the reestimation of the parameters
after the results of each in a series of experiments become
available, is an important application of the Bayesian technique. Suppose
after $N$ experiments the parameters are estimated to be $\theta = \theta_N$, with
covariance matrix $V_N$ (the matrix is contained in the output of most
parameter estimation programs). The relevant information
contained in the data from these experiments may be summarized in
the posterior distribution $p_N(\theta) = c \exp[-\tfrac{1}{2}(\theta - \theta_N)^T V_N^{-1} (\theta - \theta_N)]$, where
$c$ is an irrelevant constant. When results of subsequent experiments
become available, the likelihood function is constructed from these
alone, and $p_N(\theta)$ is used as the prior distribution.
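
A minimal Python sketch of how a prior enters the objective function follows. It assumes, purely for illustration, the single-parameter model $y = \exp(-kt)$ with normal errors of known variance; the function names and arguments are hypothetical. The bounded prior contributes either zero or negative infinity to the logarithm, and the normal prior contributes a quadratic penalty.

```python
import math


def log_prior_bounded(k, upper):
    """Constant prior on 0 < k <= upper; zero probability elsewhere."""
    return 0.0 if 0.0 < k <= upper else -math.inf


def log_prior_normal(value, center, sigma):
    """Normal prior centered on `center` (additive constant dropped)."""
    return -0.5 * ((value - center) / sigma) ** 2


def log_posterior(k, t, y, variance, upper):
    """Log likelihood for y = exp(-k t) with normal errors, plus log prior."""
    log_prior = log_prior_bounded(k, upper)
    if log_prior == -math.inf:
        return log_prior  # outside the bounds: never the maximum
    sum_sq = sum((yi - math.exp(-k * ti)) ** 2 for ti, yi in zip(t, y))
    return -0.5 * sum_sq / variance + log_prior
```

For sequential estimation the same pattern applies with `log_prior_normal` (or its multivariate analogue) centered on the previous estimate and scaled by its covariance.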

One  cannot  overstate  the  importance  of  using  all  available  prior 
information,  in  the  form  of  either  constraints  on  the  parameters  or 
of  prior  densities.  Use  of  such  information  frequently  spells  the  dif¬ 
ference  between  convergence  and  nonconvergence  of  the  estimation 
procedure. 


III.  METHODS  OF  NUMERICAL  SOLUTION 

In the preceding section the parameter estimation problem was
formulated as that of finding those values of the parameters
(possibly subject to constraints) which minimize (or maximize) a certain
objective function, whose explicit form we have derived for a
number of cases. We now describe a number of numerical methods
which are suitable for finding the minimum [or maximum;
maximizing $F(x)$ is equivalent to minimizing $-F(x)$] of the objective
function. We restrict our attention mainly to unconstrained minimization,
but where constraints exist, they may be incorporated in the
objective function by methods discussed below. In the Appendix we detail
certain computer programs which incorporate the features to be
discussed.

A. Direct-Search Methods


Methods for finding the minimum of $S(\theta)$ that do not require the
computation of the derivatives $\partial S/\partial\theta$ are known as direct-search
methods,  in  contrast  to  gradient  methods,  which  require  derivative 
evaluations.  This  distinction  is  not  always  clean-cut  since  gradient 
methods  in  which  derivatives  are  computed  by  finite  difference  ap¬ 
proximations  may  be  regarded  as  direct-search  methods.  Among 
proper  direct-search  methods  one  may  mention  those  due  to  Hooke 
and  Jeeves  [36],  Rosenbrock  [64],  and  Powell  [61,62],  the  last  refer¬ 
ence  being  specific  to  least-squares  problems.  Box  [15]  reported 
particularly  favorably  on  the  two  Powell  methods.  The  results  are 
based,  however,  on  somewhat  unrepresentative  sample  problems. 

At  present  there  exists  no  conclusive  evidence  for  preferring  one 
method,  or  set  of  methods,  over  another.  Direct-search  methods 
have  the  obvious  advantage  of  not  requiring  differentiations,  and  they 
seem  to  perform  well  on  well-conditioned  problems.  In  difficult 
problems,  however,  precise  knowledge  of  the  derivatives  can  be  cru¬ 
cial,  and  gradient  methods  tend  to  be  more  reliable.  For  this  reason 
we  omit  any  detailed  description  of  the  direct-search  methods. 

B. Gradient Methods


The problem under investigation here is to find the minimum of
$S(\theta)$. It may help the reader to visualize a set of mountains and the
need to locate the lowest valley within these mountains. To solve the
problem we proceed in an iterative sequence: given an initial value
$\theta_0$ of the parameters, we seek a new value $\theta_1$ which is nearer the
minimum, in the sense that $S(\theta_1) < S(\theta_0)$. Once $\theta_1$ has been obtained,
we proceed to find $\theta_2$, $\theta_3$, ..., each, in turn, having the property of
being closer to the minimum. In the class of methods which have
proved successful for parameter estimation, the formula used for
finding the new value is

$$\theta_1 = \theta_0 - \lambda R g \tag{27}$$

where $\lambda$ is a scalar, $R$ a matrix, and $g$ the gradient vector of $S$, i.e.,
$g_i = \partial S/\partial\theta_i$. Gradient methods differ from each other in the choice
of $R$ and of $\lambda$, and we shall now examine both items. Before doing so,
however, we note that $R$, when it premultiplies the vector $g$, twists $g$
in vector space to produce a new vector; $R$ thus determines the
direction to go from $\theta_0$. $\lambda$, being a scalar, merely defines how far along
this direction to go and determines the length of the step.

1. Choice of Direction


The choice $R = I$ (the identity matrix), i.e., $\theta_1 = \theta_0 - \lambda g$,
constitutes the method of steepest descent. It converges very slowly in
most practical problems.

The choice $R = Q^{-1}$, where $Q$ is the Hessian matrix of $S$ [i.e.,
$Q_{ij} = \partial^2 S/\partial\theta_i \partial\theta_j$], constitutes the Newton-Raphson method. It performs
beautifully when one is near the minimum and possesses some
desirable properties elsewhere (see Greenstadt [29]) but suffers from
two major difficulties:

1. Except near the minimum, a step taken along the
Newton-Raphson direction is not guaranteed to reduce $S(\theta)$, no matter what value
is chosen for $\lambda$.

2. The method requires computation of the second derivatives of
$S$, usually a laborious procedure.

These difficulties may be overcome in the following ways:

1. If $R$ is positive-definite and $g \ne 0$, then one can show that
$S(\theta_1) < S(\theta_0)$ for sufficiently small $\lambda$. Unfortunately $Q$ (and therefore $Q^{-1}$)
is not necessarily positive-definite away from the minimum.
However, let $\mu_1, \mu_2, \ldots, \mu_p$ be the eigenvalues of $Q$, and let $v_1, v_2, \ldots, v_p$
be the corresponding eigenvectors. The inverse of $Q$ may be
computed from its spectral representation

$$(Q^{-1})_{ij} = \sum_{k=1}^{p} \mu_k^{-1} v_{ki} v_{kj}$$

where $v_{ki}$ is the $i$th component of $v_k$. Greenstadt [24,29] has
recommended that one define $R$ by

$$R_{ij} = \sum_{k=1}^{p} |\mu_k|^{-1} v_{ki} v_{kj} \tag{28}$$

If some $\mu_k = 0$, we replace it by a small positive number. As
defined by Eq. (28), $R$ is positive-definite and coincides with $Q^{-1}$ where
the latter is also positive-definite. We thus retain all the advantages
of the Newton-Raphson method, at the cost of having to compute the
eigenvalues and vectors of $Q$, instead of simply solving the set of
simultaneous linear equations $Q \Delta\theta = -g$. This cost is often a small
one, since the computation of $S(\theta)$ itself is the major time-consuming
operation.
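
The eigenvalue modification can be illustrated for a two-parameter problem, where the eigenvalues and eigenvectors of the symmetric matrix $Q$ are available in closed form. The Python sketch below builds $R_{ij} = \sum_k |\mu_k|^{-1} v_{ki} v_{kj}$; the function names and the `floor` used in place of a vanishing eigenvalue are illustrative assumptions, and a general-purpose program would call an eigenvalue routine instead.

```python
import math


def eigen_2x2(q):
    """Eigenvalues and unit eigenvectors of a symmetric 2x2 matrix."""
    a, b, c = q[0][0], q[0][1], q[1][1]
    if b == 0.0:
        return [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    pairs = []
    for mu in (mean + spread, mean - spread):
        x, y = b, mu - a  # solves (a - mu) * x + b * y = 0
        norm = math.hypot(x, y)
        pairs.append((mu, (x / norm, y / norm)))
    return pairs


def modified_inverse(q, floor=1e-8):
    """Positive-definite R built from the absolute eigenvalues of q."""
    r = [[0.0, 0.0], [0.0, 0.0]]
    for mu, v in eigen_2x2(q):
        size = max(abs(mu), floor)  # guards against a zero eigenvalue
        for i in range(2):
            for j in range(2):
                r[i][j] += v[i] * v[j] / size
    return r
```

For a positive-definite `q` such as `[[2, 1], [1, 2]]` the result equals the ordinary inverse; for the indefinite `[[1, 2], [2, 1]]`, whose eigenvalues are 3 and -1, the result is still positive-definite.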

An alternative solution, suggested by Levenberg [53], Marquardt
[54], and Goldfeld et al. [28], is to make $R = (Q + \nu I)^{-1}$, where $I$ is the
identity matrix. If $\nu$ is greater than the magnitude of the most negative
eigenvalue of $Q$, the matrix $R$ will be positive-definite. The required
value of $\nu$ may be determined by trial and error, without actually
computing the eigenvalues of $Q$.

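2. It is fortunately possible in most parameter estimation
problems to obtain a reasonable approximation to the second-derivative
matrix $Q$ without computing any second derivatives of the model
equations. We illustrate this point by means of the least-squares
criterion, where

$$S(\theta) = \sum_{\mu=1}^{m} \epsilon_\mu^2 = \sum_{\mu=1}^{m} [y_\mu - f(x_\mu, \theta)]^2$$

$$g_i = \frac{\partial S}{\partial\theta_i} = -2 \sum_{\mu} \epsilon_\mu \frac{\partial f_\mu}{\partial\theta_i} \qquad [f_\mu = f(x_\mu, \theta)]$$

$$Q_{ij} = \frac{\partial^2 S}{\partial\theta_i \partial\theta_j} = -2 \sum_{\mu} \epsilon_\mu \frac{\partial^2 f_\mu}{\partial\theta_i \partial\theta_j} + 2 \sum_{\mu} \frac{\partial f_\mu}{\partial\theta_i} \frac{\partial f_\mu}{\partial\theta_j} \tag{29}$$

If the fit of the model to the experimental data is at all good, the
$\epsilon_\mu$ will be small, and the first term of Eq. (29) will be negligible
compared to the second. Thus, we replace $Q$ by the approximation
$Q^*$ (neglecting the second-derivative terms):

$$Q^*_{ij} = 2 \sum_{\mu} \frac{\partial f_\mu}{\partial\theta_i} \frac{\partial f_\mu}{\partial\theta_j} \tag{30}$$

Using $R = (Q^*)^{-1}$ constitutes the Gauss-Newton method. An
alternative interpretation of the method is to view it as replacing the model
equations by their tangents [i.e., neglecting $\partial^2 f_\mu/\partial\theta_i \partial\theta_j$] and solving
the resultant linear regression problem to obtain the starting point
for the next iteration.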

The same treatment can be applied to weighted least squares and
to more general maximum likelihood problems. In all cases
discussed in the previous section, it can be shown that in the expression
for $\partial^2 S/\partial\theta_i \partial\theta_j$, the quantities $\partial^2 f_\mu/\partial\theta_i \partial\theta_j$ are always multiplied
by some residual and may therefore be neglected. Thus the first
derivatives of the model functions suffice to determine an
approximation $Q^*$ for the matrix $Q$.

An alternative approach to approximating $Q$ without calculating
second derivatives is contained in the Davidon-Fletcher-Powell
[20,25] method. We start with an arbitrary symmetric
positive-definite matrix $R_0$. Let $\theta_i$ and $g_i$ denote, respectively, the values of the
parameters and of the gradient of $S$ at the $i$th iteration. Then
$\theta_{i+1} = \theta_i - \lambda_i R_i g_i$, with $\lambda_i$ chosen so as to minimize $S(\theta_{i+1})$ along the
chosen direction. Let $\sigma_i = \theta_{i+1} - \theta_i$ and $\gamma_i = g_{i+1} - g_i$. Then

$$R_{i+1} = R_i - \frac{R_i \gamma_i \gamma_i^T R_i}{\gamma_i^T R_i \gamma_i} + \frac{\sigma_i \sigma_i^T}{\sigma_i^T \gamma_i} \tag{31}$$

For a well-behaved function $S(\theta)$, the sequence of matrices $R_i$
($i = 0, 1, 2, \ldots$) converges to $Q^{-1}$. The precise conditions which $S(\theta)$
must satisfy for the above statement to be true are not known.
However, it can be shown that in principle the $R_i$ are positive-definite
as long as no $\lambda_i$ vanishes. Numerical difficulties, however, require
that the initial matrix $R_0$ approximate $Q^{-1}$ at least in the magnitude
of its diagonal terms.

In  tests  on  various  kinetics  models,  Bard  [2]  has  found  the  gen¬ 
eralized  Gauss-Newton  method  to  be  somewhat  more  efficient  than 
its  competitors. 


2.  Choice  of  Step  Length 

In Davidon's method, $\lambda$ is chosen so as to minimize $S(\theta)$ along
the chosen direction. However, experience has shown that $\lambda$ need
be determined only to about one part in a thousand.

In the original Newton-Raphson method, $\lambda = 1$. This is usually
not a satisfactory procedure, except very near the minimum. In
Marquardt's method also, $\lambda = 1$ is used. If it turns out that
$S(\theta_1) \ge S(\theta_0)$, the value of $\nu$ is increased and $\theta_1$ is recomputed, until
$S(\theta_1) < S(\theta_0)$.

In the Gauss-Newton method we initially set $\lambda = 1$ (if there are
constraints, a smaller initial value may be required to guarantee
that $\theta_1$ is in the feasible region). If $S(\theta_0 - \lambda R g) \ge S(\theta_0)$, we choose a
smaller value of $\lambda$ [e.g., $\lambda/2$, or the value that would minimize a
parabolic approximation to $S(\theta)$ based on its computed values at
$\lambda = 0$ and $\lambda = 1$, and on its gradient at $\lambda = 0$]. We repeat until
$S(\theta_0 - \lambda R g) < S(\theta_0)$, which will always hold for sufficiently small $\lambda$.

In the Newton-like methods discussed above (Davidon's method
excepted), it seems that an extensive search for the value of $\lambda$ that
minimizes $S(\theta_0 - \lambda R g)$ is not justified. It rarely pays to test more
than one additional value of $\lambda$ beyond the first one to give an
improvement.
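
The Gauss-Newton direction and the step-halving rule can be combined in a few lines. The Python sketch below is an illustration under stated assumptions: the two-parameter model $f = \theta_1 \exp(-\theta_2 t)$, the function names, the iteration limit, and the smallest step tried are all hypothetical choices. With $Q^* = 2J^TJ$ and $g = -2J^T\epsilon$, the step $-Rg$ is the solution of $(J^TJ)\,d = J^T\epsilon$, which is what `gauss_newton_step` computes.

```python
import math


def model(theta, t):
    """Illustrative model f = theta_1 * exp(-theta_2 * t)."""
    return theta[0] * math.exp(-theta[1] * t)


def sum_sq(theta, t, y):
    """Least-squares objective S(theta)."""
    return sum((yi - model(theta, ti)) ** 2 for ti, yi in zip(t, y))


def gauss_newton_step(theta, t, y):
    """Solve (J^T J) d = J^T eps for the two-parameter model."""
    a = b = c = u = v = 0.0
    for ti, yi in zip(t, y):
        e = math.exp(-theta[1] * ti)
        j1 = e                    # df/dtheta_1
        j2 = -theta[0] * ti * e   # df/dtheta_2
        eps = yi - theta[0] * e   # residual
        a += j1 * j1
        b += j1 * j2
        c += j2 * j2
        u += j1 * eps
        v += j2 * eps
    det = a * c - b * b
    return ((c * u - b * v) / det, (a * v - b * u) / det)


def fit(theta, t, y, iterations=50):
    """Gauss-Newton iterations; the step is halved until S decreases."""
    for _ in range(iterations):
        s_old = sum_sq(theta, t, y)
        step = gauss_newton_step(theta, t, y)
        lam = 1.0
        while lam > 1e-10:
            trial = (theta[0] + lam * step[0], theta[1] + lam * step[1])
            if sum_sq(trial, t, y) < s_old:
                theta = trial
                break
            lam *= 0.5
        else:
            break  # no improving step was found; stop iterating
    return theta
```

Starting from `(1.0, 0.1)` on error-free data generated with `(2.0, 0.5)`, the full step is accepted at the first iteration and the estimates converge within a few iterations.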

3.  Initiation 

Equation (27), combined with the choices of $R$ and $\lambda$, defines the
iterative procedure. To start the procedure, however, we need
initial estimates or guesses for $\theta_0$. The success of the minimization
procedure often depends on these guesses, and it is clearly
advantageous to make these guesses as close to the true values (the
minimum) as possible. Several methods for arriving at reasonable initial
guesses have been summarized by Kittrell et al. [45]. Among these
are:

1.  Use  all  available  prior  information. 

2. Use the results of linear least-squares estimates based on
linearized or rearranged equations.

3.  Fix  the  values  of  some  of  the  parameters  (e.g.,  those  on  which 
most  prior  information  is  available)  and  estimate  only  the  other  pa¬ 
rameters.  Use  these  results  as  initial  guesses  for  estimating  all  pa¬ 
rameters  simultaneously. 

4.  Compute  the  objective  function  on  a  sparse  grid  of  parameter 
values.  Use  the  grid  point  with  optimal  objective  function  value  as 
the  initial  guess. 

5.  Use  an  analog  computer  to  simulate  the  reaction.  Search  for 
the  optimum  by  turning  the  analog  elements  which  set  the  parameter 
values.  This  method  may  also  be  combined  with  the  grid  search  des¬ 
cribed  above. 

If  the  number  of  unknown  parameters  is  too  large  to  make  a  grid 
search  feasible,  it  is  possible  to  conduct  a  random  search  instead. 

None  of  these  methods  are  infallible,  and  which  one  will  work  best 
in  any  given  situation  can  be  determined  only  by  experience. 

4.  Termination  and  Convergence 


Once the iterative procedure has been started, it will continue to
run on the computer until some termination criterion is satisfied.
At the minimum of $S$, the gradient $g$ should vanish. Due to round-off
errors, however, the condition $g = 0$ can never be attained precisely,
and cannot be used as a termination criterion. In practice it seems
preferable to terminate whenever the iterative procedure ceases to
cause significant changes in the $\theta$'s, i.e., (as suggested by Marquardt
[54]) when

$$|\theta_1 - \theta_0| \, (\epsilon_1 + |\theta_0|)^{-1} \le \epsilon_2$$

for all elements of $\theta$, where $\epsilon_1$ and $\epsilon_2$ are predetermined vectors with
small elements. It will be found that frequently even very close to
the minimum the gradient still has large elements. A good test of
stationarity is then given by the dimensionless quantities $|g_i/(\theta_i Q_{ii})|$,
which are all required to be small compared to unity. If the matrix
$Q$ is available, positivity of all its eigenvalues indicates that the
solution (if stationary) is indeed a (local) minimum. There is usually
no way of proving that a global minimum has been reached.
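
Both tests are straightforward to code. In the Python sketch below the function names are hypothetical, the tolerances are passed in as scalars rather than vectors for brevity, and the stationarity check assumes that no parameter and no diagonal element of $Q$ is exactly zero.

```python
def has_converged(theta_new, theta_old, eps1, eps2):
    """True when every scaled parameter change is at most eps2."""
    return all(
        abs(new - old) / (eps1 + abs(old)) <= eps2
        for new, old in zip(theta_new, theta_old)
    )


def is_stationary(gradient, theta, q_diagonal, tolerance):
    """True when every |g_i / (theta_i * Q_ii)| is at most the tolerance."""
    return all(
        abs(g / (th * q)) <= tolerance
        for g, th, q in zip(gradient, theta, q_diagonal)
    )
```

The offset `eps1` keeps the first test meaningful when a parameter is close to zero, where a purely relative change would be ill-defined.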



Failure  to  converge  to  a  stationary  point  is  usually  due  to  one  of 
the  following  reasons- 

1.  A  constraint  has  been  reached.  If  the  constraint  is  one  about 
which  there  is  no  doubt  (e.g.,  the  condition  that  a  rate  constant  must 
be  positive),  it  is  likely  that  the  mechanism  chosen  is  Inappropriate. 

2.  The  matrix  R  is  not,  or  is  insufficiently,  positive-definite.  We 
have  described  above  how  this  problem  may  be  overcome. 

3. The termination criterion is not stringent enough. Vectors
$\epsilon_1$ and $\epsilon_2$ with elements of $10^{-3}$ and $10^{-4}$,
respectively, have worked well in many problems but may fail with
others.

4.  The  gradient  is  computed  with  insufficient  precision.  When¬ 
ever  possible,  gradients  should  be  computed  by  analytic  differentia¬ 
tion,  rather  than  by  means  of  finite  difference  approximations. 

We  note  that  in  kinetics  analysis  the  model  equations  are  generally 
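given in the form of solutions of differential equations, e.g.,

$$ \frac{d\eta_i}{dt} = g_i(\eta, \theta), \qquad i = 1, 2, \ldots, n \qquad (32) $$

The question then is how to obtain analytic derivatives of these
solutions. Differentiating Eq. (32) with respect to $\theta_j$, we have,
using the chain rule,

$$ \frac{\partial}{\partial\theta_j} \frac{d\eta_i}{dt} = \sum_{k=1}^{n} \frac{\partial g_i}{\partial\eta_k} \frac{\partial\eta_k}{\partial\theta_j} + \frac{\partial g_i}{\partial\theta_j} $$

Interchanging the orders of differentiation yields

$$ \frac{d}{dt} \frac{\partial\eta_i}{\partial\theta_j} = \sum_{k=1}^{n} \frac{\partial g_i}{\partial\eta_k} \frac{\partial\eta_k}{\partial\theta_j} + \frac{\partial g_i}{\partial\theta_j} \qquad (33) $$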

Equation (33) is a set of linear ordinary differential equations in the
unknown functions $\partial\eta_k/\partial\theta_j$ (k = 1, 2, ..., n;
j = 1, 2, ..., p). These may be integrated numerically alongside the
original Eqs. (32) to obtain the desired values of
$\partial\eta_k/\partial\theta_j$ at any time t. Although one can
scarcely call these "analytic derivatives," they will be determined to
the same accuracy as the $\eta_i$ themselves, and that is the best one
can hope for. The alternative procedure is to integrate Eqs. (32) for
slightly perturbed values of the $\theta$'s, and estimate the
$\partial\eta_k/\partial\theta_j$ by finite differences. The total
number of integrations performed is the same in both methods, with the
"analytic" one usually providing greater accuracy.
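
As an illustration of the two procedures, the following Python sketch integrates a single first-order decay, $d\eta/dt = -\theta\eta$ with $\eta(0) = 1$, together with its sensitivity equation, and then repeats the integration at a perturbed $\theta$ for the finite-difference estimate. The example system, the hand-written Runge-Kutta routine `rk4`, and the names `augmented_rhs`, `sens`, and `sens_fd` are assumptions made for this sketch; they are not taken from the programs discussed in the text.

```python
import math


def rk4(f, y, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with the classical fourth-order Runge-Kutta method."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def augmented_rhs(theta):
    """Right-hand side for (eta, s), where s = d(eta)/d(theta) and g = -theta * eta."""
    def f(t, y):
        eta, s = y
        # Sensitivity equation: ds/dt = (dg/d eta) * s + dg/d theta = -theta * s - eta
        return [-theta * eta, -theta * s - eta]
    return f


theta, t_end = 0.7, 2.0

# "Analytic" route: model and sensitivity equation integrated side by side.
eta, sens = rk4(augmented_rhs(theta), [1.0, 0.0], 0.0, t_end, 200)

# Finite-difference route: a second integration at a slightly perturbed theta.
delta = 1e-6
eta_pert, _ = rk4(augmented_rhs(theta + delta), [1.0, 0.0], 0.0, t_end, 200)
sens_fd = (eta_pert - eta) / delta
```

For this system the exact sensitivity is $-t\,e^{-\theta t}$, so both estimates can be checked directly. The finite-difference value carries an extra error that depends on the size of the perturbation, which is the loss of accuracy the text refers to.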

C.  Constraints 

To handle constraints within the framework of unconstrained
minimization, it is necessary to adjust the objective function in such a
way that one pays a penalty whenever one comes near to violating the
constraint. Carroll's method [16] uses the following device: Let
$S(\theta)$ be the function whose minimum is to be found subject to a
set of constraints $g_i(\theta) \ge 0$, i = 1, 2, .... We introduce a
new objective function:

$$ S^*(\theta) = S(\theta) + \sum_i \frac{\alpha_i}{g_i(\theta)} \qquad (34) $$

where the $\alpha_i$ are suitably chosen small positive constants. As
one approaches, say, the jth constraint, $g_j(\theta)$ approaches zero,
and $S^*(\theta)$ increases beyond bounds. At a point far from any
constraint, the function $S^*(\theta)$ differs but little from
$S(\theta)$, and the minima of the two should nearly coincide. After
finding the minimum of $S^*(\theta)$, one may reduce the $\alpha_i$ by,
e.g., a factor of 10, find the minimum of the new $S^*(\theta)$, and
repeat the procedure until the contribution of the "penalty function"

$$ \sum_i \frac{\alpha_i}{g_i(\theta)} $$

becomes negligible. If the minimum of $S(\theta)$ is actually on a
constraint, this procedure may approach the true minimum as closely
as one wishes.
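
The repeated reduction of the penalty constant can be sketched as follows for a one-parameter problem. The objective $S(\theta) = (\theta + 1)^2$, the single constraint $\theta \ge 0$, and the golden-section line search are assumptions chosen for this illustration; they stand in for whatever minimizer is actually used. The unconstrained minimum lies at $\theta = -1$, outside the feasible region, so the constrained minimum sits on the constraint at $\theta = 0$.

```python
def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2


def S(theta):
    return (theta + 1.0) ** 2      # unconstrained minimum at theta = -1 (infeasible)


def g(theta):
    return theta                   # constraint g(theta) >= 0


def penalized(alpha):
    """Objective of the form of Eq. (34) for a single constraint."""
    return lambda theta: S(theta) + alpha / g(theta)


alpha = 1.0
history = []                       # (alpha, minimizer, penalty contribution)
for _ in range(7):
    theta_min = golden_section_min(penalized(alpha), 1e-12, 5.0)
    history.append((alpha, theta_min, alpha / g(theta_min)))
    alpha /= 10.0                  # reduce the penalty constant by a factor of 10
```

Each pass moves the minimizer closer to the constraint while keeping it strictly feasible, and the penalty contribution recorded in `history` shrinks along with the penalty constant.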

In many special cases the constraints may be handled by other
methods. For example, if it is required that $\theta_1 \ge 0$, we may
substitute $\psi^2$ for $\theta_1$ and minimize with respect to the
unconstrained variable $\psi$. As shown by Box [15], similar
substitutions are possible in many cases but often require considerable
ingenuity.

D.  Interpretation  of  Parameter  Estimates 


The  best  estimated  values  of  the  parameters  are,  in  themselves, 
actually  of  little  use.  It  is  essential  to  know  not  only  what  the  esti¬ 
mates  are,  but  also,  and  more  importantly,  how  reliable  they  are. 
The  observations  are  random  variables,  and  hence  the  estimates 
which  are  computed  from  them  are  also  random  variables;  thus  it  is 
meaningful  to  try  and  estimate  their  probability  distribution.  For¬ 
tunately  it  turns  out  that  the  distribution  usually  approaches  the  nor¬ 
mal  as  the  number  of  observations  is  increased.  The  means  of  the 
estimated  distribution  constitute  the  estimated  values  of  the  param¬ 
eters;  the  covariance  matrix  of  the  distribution  is  a  measure  of 
the  reliability  of  the  estimates.  This  matrix  expresses  the  manner 
in  which  variations  in  the  observations  would  affect  the  parameter 
estimates.

The alternative approach to evaluating the reliability is by exploring
the dependency of the objective function on the parameters. If the
objective function is little affected by changes in a certain parameter,
one would have doubts concerning its value. Since at the final estimates
the objective function has a minimum, its gradient vanishes and the
effect of the parameters on the objective function is summarized in the
second derivative matrix. It turns out that these two measures of
reliability are equivalent; in fact if V is the covariance matrix of the
estimates, then

$$ [V^{-1}]_{ij} \approx -\frac{\partial^2 \log L}{\partial\theta_i \, \partial\theta_j} $$


where  L  is  the  likelihood  function. 

Most  parameter  estimation  programs  print  out  estimates  of  the 
matrix  V.  The  diagonal  elements  of  V  are  the  variances,  and  their 
square  roots  are  the  standard  deviations  of  the  parameter  estimates. 
Off-diagonal  elements  indicate  the  interdependence  of  the  estimates 
of  the  various  parameters.  It  is  convenient  to  eliminate  these  de¬ 
pendencies  by  finding  those  linear  combinations  of  the  parameters 
which  are  statistically  independent.  These  are  known  as  principal 
components. Let $\mu_i$ be an eigenvalue of V with eigenvector $v_i$,
whose components are $v_{ij}$. Then the estimate of the quantity
$\psi_i = \sum_j v_{ij}\theta_j$ has variance $\mu_i$, and the estimates
of the different $\psi_i$ are independent. The $\psi_i$ are, then, the
desired principal components. Examination of the $\mu_i$ will reveal
which linear combinations of the parameters are well determined (small
$\mu_i$) and which are doubtful in value (large $\mu_i$).
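
For two parameters the principal components can be written in closed form, which the following Python sketch uses. The function name `principal_components_2x2` and the example covariance matrix are hypothetical; the matrix was chosen with a strong negative correlation so that the sum of the two parameters is well determined while their difference is not.

```python
import math


def principal_components_2x2(v):
    """Eigenvalues and unit eigenvectors of a symmetric 2x2 covariance matrix.

    Returns [(mu_small, vec_small), (mu_large, vec_large)].
    """
    (a, b), (_, d) = v
    if b == 0:
        # Already diagonal: the parameters themselves are the principal components.
        return sorted([(a, (1.0, 0.0)), (d, (0.0, 1.0))])
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    result = []
    for mu in (mean - radius, mean + radius):
        # (b, mu - a) solves (V - mu I) x = 0 when b is nonzero.
        vec = (b, mu - a)
        norm = math.hypot(*vec)
        result.append((mu, (vec[0] / norm, vec[1] / norm)))
    return result


# Hypothetical covariance matrix of two strongly anticorrelated estimates.
V = [[1.0, -0.9], [-0.9, 1.0]]
(mu_small, vec_small), (mu_large, vec_large) = principal_components_2x2(V)
```

Here the small eigenvalue belongs to an eigenvector with two equal components, i.e., to the sum of the parameters, and the large eigenvalue belongs to their difference.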

Once  again  we  point  out  that  the  Appendix  contains  a  discussion 
of  available  computer  programs  which  have  some  or  all  the  features 
discussed  in  this  section. 


IV.  IMPROVING  PARAMETER  ESTIMATES  BY 
PROPER  EXPERIMENTAL  DESIGN 

Examination of the posterior distribution (i.e., of the covariance
matrix  of  the  parameter  estimates)  will  sometimes  reveal  that  some 
of  the  parameters,  or  some  linear  combinations  of  the  parameters, 
are  ill  determined.  We  may  distinguish  three  causes  for  such  an  oc¬ 
currence: 

1. The model chosen to fit the data is inappropriate. This will be
marked by the appearance of large systematic deviations (residuals)
of the experimentally measured from the predicted values of the
observed variables. The obvious remedy is to modify the model. Those
terms whose parameters are the most ill determined should be prime
targets for elimination or modification. The nature of the deviations of the
residuals  may  also  hold  clues  as  to  how  the  model  should  be  modified. 
Hunter  and  co-workers  [38, 39, 44]  have  provided  an  excellent  example  of 
how  residual  analysis  can  lead  to  a  systematic  modification  of  the  model. 

2.  The  measurement  precision  is  low.  This  will  be  characterized 
by  large  random  residuals.  If  no  improvement  of  the  measurement 
techniques  is  feasible,  the  only  remedy  is  to  make  more  measure¬ 
ments.  Unfortunately  a  tenfold  increase  in  precision  may  require  a 
100-fold  increase  in  the  number  of  experiments.  It  should  be  noted, 
however,  that  the  attainable  precision  in  the  estimates  for  a  given 
number  of  experiments  is  maximized  when  the  experimental  con¬ 
ditions  are  chosen  properly.  This  point  is  discussed  below. 

3.  The  experiments  were  not  properly  designed.  This  is  the  con¬ 
clusion  that  must  be  reached  if  some  parameters  have  a  large  vari¬ 
ance,  even  though  the  overall  fit  is  good,  i.e.,  even  though  the  resi¬ 
duals  are  small.  We  cite  two  examples  of  how  this  may  happen: 

a. Suppose species A is assumed to decompose, in two parallel
reactions, to species B and C, with rate constants $k_1$ and $k_2$,
respectively. If measurements on the concentrations of A alone are
available, it is clearly possible to determine $k_1 + k_2$, but not
$k_1$ and $k_2$ individually. While this example may appear trivial,
similar effects may arise and be less obvious in more complicated
situations.

b. Suppose the model equation is

$$ y_\mu = \theta_1 x_{\mu 1} + \theta_2 x_{\mu 2} $$

and in all experiments it happened that $x_{\mu 1} = x_{\mu 2}$. Then
the equation

$$ y_\mu = (\theta_1 + \theta_2) x_{\mu 1} = \theta_3 x_{\mu 1} \qquad (35) $$

would represent the data just as well as the original equation. On this
basis it is impossible to estimate $\theta_1$ and $\theta_2$
individually, but rather only as their sum.

In both these cases proper planning of the experiments would have
eliminated the difficulties: in the first case, by measuring
concentrations of B and/or C; in the second case, by varying
$x_{\mu 1}$ and $x_{\mu 2}$ independently.
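
The second example can be checked numerically. For a model that is linear in two parameters, the parameters are separately estimable by least squares only if the normal matrix built from the experimental conditions is nonsingular. The sketch below uses two made-up designs (the numbers are illustrative only) and the hypothetical helper names `normal_matrix` and `det2`.

```python
def normal_matrix(xs):
    """X^T X for the two-parameter linear model y = theta1 * x1 + theta2 * x2."""
    a = sum(x1 * x1 for x1, _ in xs)
    b = sum(x1 * x2 for x1, x2 in xs)
    d = sum(x2 * x2 for _, x2 in xs)
    return [[a, b], [b, d]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Illustrative designs: each tuple is (x1, x2) for one experiment.
collinear = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]      # x1 equals x2 in every run
independent = [(1.0, 3.0), (2.0, 1.0), (3.0, 2.0)]    # x1 and x2 varied separately

det_collinear = det2(normal_matrix(collinear))
det_independent = det2(normal_matrix(independent))
```

With the collinear design the determinant vanishes, so only the sum of the two parameters can be estimated; varying the two conditions independently gives a nonzero determinant.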

When  we  attempt  to  systematize  the  selection  of  appropriate  ex¬ 
periments,  we  are  naturally  led  into  the  realm  of  information  theory. 
This  follows  from  the  fact  that  the  purpose  of  an  experiment  is  to 
gain  information.  In  the  case  of  parameter  estimation  the  relevant 
information  is  contained  in  the  posterior  distribution  of  the  para¬ 
meters.  When  one  tries  to  formalize  intuitive  notions  concerning  the 
amount of information contained in a given distribution (e.g., a
distribution with a small variance contains more information concerning
the value of a parameter than one with a large variance) one is
led [1] to the following formula:

$$ I = E(\log p) = \int p \log p \, d\theta \qquad (36) $$

where  I  is  the  measure  of  information,  p  is  the  probability  density 
function,  and  E  denotes  the  expected  value.  If  p  is  a  multivariate 
normal  distribution  with  covariance  matrix  V,  then  the  information 
is  given  by 

I  =  C  log(det  V)  (37) 

where C is a constant. To derive the maximum amount of relevant
information,  we  must  plan  our  experiments  so  as  to  minimize  the 
value  of  (det  V),  where  V  is  the  expected  variance  of  the  posterior 
distribution. 

Suppose, now, that we wish to plan a series of, e.g., m experiments.
Our current information on the values of the parameters is summarized
in a prior distribution $p_0(\theta)$, e.g., a normal distribution with
covariance matrix $V_0$. In many cases, $p_0(\theta)$ will actually be
the posterior distribution obtained by estimating the parameters on the
basis of experiments conducted to date. Then it follows from Eq. (37)
that the expected covariance of the posterior distribution is given
approximately by

$$ [V^{-1}]_{ij} = [V_0^{-1}]_{ij} + \sum_{\mu=1}^{m} \sum_{k,l} \frac{\partial f_{\mu k}}{\partial\theta_i} [U^{-1}]_{kl} \frac{\partial f_{\mu l}}{\partial\theta_j} \qquad (38) $$

where U is the covariance matrix of the observations for each
experiment. The derivatives $\partial f_{\mu k}/\partial\theta_j$ are to
be evaluated at the current estimates of the $\theta$'s. The simplest
case occurs when there is only one observed variable per experiment.
Then the subscripts k and l may be omitted, and the matrix U becomes a
single number $\sigma^2$, where $\sigma$ is the standard deviation of
the measurements. Then Eq. (38) becomes

$$ [V^{-1}]_{ij} = [V_0^{-1}]_{ij} + \frac{1}{\sigma^2} \sum_{\mu=1}^{m} \frac{\partial f_\mu}{\partial\theta_i} \frac{\partial f_\mu}{\partial\theta_j} \qquad (39) $$


This  formula  was  derived  by  Draper  and  Hunter  [22]. 

Equation (38) or (39) reveals that (det V) is a very complicated
function of the $x_{\mu j}$, i.e., of the experimental conditions
proposed for the m desired experiments. Proper design of these
experiments requires that one select that set of feasible experimental
conditions $x_{\mu i}$, $\mu = 1, 2, \ldots, m$, that minimizes (det V)
[or equivalently maximizes (det $V^{-1}$)].
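
A minimal sketch of this selection, for the single-response case of Eq. (39) and one additional experiment, is given below. The model $f = \theta_1 \exp(-\theta_2 x)$, the prior information matrix, the measurement standard deviation, and the candidate conditions are all assumptions invented for the illustration, as are the function names.

```python
import math


def sensitivities(x, theta):
    """Derivatives of the assumed model f = theta1 * exp(-theta2 * x) w.r.t. theta1, theta2."""
    t1, t2 = theta
    e = math.exp(-t2 * x)
    return (e, -t1 * x * e)


def posterior_information(v0_inv, xs, theta, sigma):
    """Eq. (39): V^-1 = V0^-1 + (1/sigma^2) * sum over experiments of df/dtheta_i * df/dtheta_j."""
    info = [row[:] for row in v0_inv]
    for x in xs:
        d = sensitivities(x, theta)
        for i in range(2):
            for j in range(2):
                info[i][j] += d[i] * d[j] / sigma ** 2
    return info


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def best_next_experiment(v0_inv, candidates, theta, sigma):
    """Candidate condition that maximizes det(V^-1), i.e., minimizes det(V)."""
    return max(candidates,
               key=lambda x: det2(posterior_information(v0_inv, [x], theta, sigma)))


# Assumed current state: theta1 already well known, theta2 poorly known.
v0_inv = [[100.0, 0.0], [0.0, 1.0]]
theta_now = (1.0, 0.5)
candidates = [0.0, 1.0, 2.0, 4.0, 8.0]
x_best = best_next_experiment(v0_inv, candidates, theta_now, sigma=0.1)
```

With these assumed numbers the search picks x = 2.0, the candidate at which the response is most sensitive to the poorly known parameter. A search over a grid of candidates sidesteps the derivatives of the information function, at the cost of evaluating every candidate.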

Most  of  the  work  in  this  field  to  date  is  contained  in  a  series  of 
papers by Box, Hunter, Kittrell, and their co-workers at the University
of Wisconsin [12,14,21,47]. They derive the relevant formulas
from  several  alternative  points  of  view  and  apply  them  to  several 
computer-simulated  chemical  reaction  models.  The  problem  of  find¬ 
ing  the  maximum  of  the  information  function  is  complicated  by  the 
fact  that  it  usually  possesses  multiple  local  maxima  and  that  its  de¬ 
rivatives  are  usually  too  complicated  to  calculate.  Considerable  ad¬ 
ditional  work  in  the  field  is  required. 


V.  COMPUTATIONAL  RESULTS  IN  PARAMETER  ESTIMATION 

In  the  previous  sections  we  have  developed  the  conceptual  ideas 
associated  with  nonlinear  least  squares  as  applied  to  kinetic  models. 
The  present  section  will  outline  some  of  the  important  results  which 
have  been  obtained  to  date  in  this  area.  The  depth  of  coverage  is 
minimal  but,  it  is  hoped,  sufficient  to  highlight  the  various  points. 

Some  of  the  earlier  work  in  the  area  of  parameter  estimation  in 
kinetic  systems  is  due  to  such  authors  as  Box  [8],  Box  and  Coutie  [9], 
Blakemore and Hoerl [5], Cull and Brenner [18], Hartley [33],
Marquardt [54], Peterson [57,58] and Rubin [65]. In particular we should
comment  on  the  pioneering  works  by  Box  and  by  Peterson.  Both  these 
authors  pointed  out  the  positive  and  negative  aspects  of  nonlinear 
parameter  estimation  and  the  important  advantages  associated  with 
the  use  of  a  digital  computer  for  the  analysis.  While  Peterson’s 
work  was  directed  toward  the  computer  aspects,  Box  derived  and 
discussed  in  detail  all  the  statistical  features  and  developments.  In 
these  two  papers  are  contained  many  of  the  basic  ideas  for  work 
carried  out  to  date  in  the  nonlinear  estimation  area. 

Blakemore  and  Hoerl  analyzed  the  famous  or  infamous  hydrogen¬ 
ation  of  codimer  originally  analyzed  by  Hougen  and  Watson  [37]  via 
linear  least  squares  (for  a  further  discussion  see  a  later  part  of  this 
paper). For this system, at least 20 alternative models may be
postulated as representing the rate-determining step. These authors
showed  that  it  was  impossible  to  select  any  one  model  as  being  the 
best  model  of  the  entire  set.  The  need  for  further  and  more  exten¬ 
sive  experimental  data  was  conclusively  shown,  thus  pointing  out  that 
once a fixed set of experimental data is available the most
sophisticated analysis of this data may be inadequate to isolate a
single model. Cull and Brenner investigated hexane isomerization
kinetics via the nonlinear approach. In this case they were able to
conclude that one of their postulated reaction steps was the
rate-controlling step within the level of precision of the data.

In  a  slightly  more  recent  paper  Freeh  et  al.  [27]  analyzed  the  hy¬ 
drogenation  reaction  in  a  stirred  reactor.  By  selecting  that  model 
whose  standard  deviation  of  the  fit  to  experimental  data  was  lowest, 
they  were  able  to  isolate  a  single  model  as  representative  of  the  re¬ 
action  mechanism. 

In  the  last  few  years  much  of  the  nonlinear  parameter  estimation 
in  kinetics  systems  has  been  carried  out  by  two  groups  of  research¬ 
ers;  the  first  group  includes  Lapidus  and  Peterson  [52,59]  and  the 
second  group  includes  Kittrell,  Hunter,  Mezaki,  and  Watson  [42-49]. 

In  addition,  nonlinear  estimation  methods  have  been  applied  in  other 
allied areas. Thus the work of Heineken et al. [34] and Bellman et al.
[4] deals with an analysis of the kinetics of biological reactions via
some  of  the  computer  methods  discussed  here  (see  also  Refs.  [17]  and 
[63]  for  the  adaptive  control  of  a  batch  reactor). 

The  work  of  Peterson  [57,58]  is  worth  discussing  in  some  detail 
since  it  represents  a  concrete  case  where  nonlinear  estimation  was 
able  to  provide  new  insight  into  a  kinetics  mechanism.  The  integral 
conversion  data  of  both  D’Alessandro  and  Farkas  [19]  and  Ioffe  and 
Sherman  [40]  on  the  vapor-phase  homogeneous  and  noncatalytic  oxi¬ 
dation  of  naphthalene  were  analyzed.  The  first  study  concerned 
naphthalene  depletion  in  which  the  reaction  kinetics  were  free  of  ex¬ 
traneous  effects;  the  second  study  treated  the  mechanism  of  the  com¬ 
plete  reaction,  including  reaction  products,  and  revealed  certain 
complicating  factors  which  influence  the  behavior  of  the  reaction. 

D'Alessandro and Farkas obtained data for the vapor-phase oxidation
of naphthalene in the presence of a catalyst consisting of vanadium
pentoxide in a flow reactor. Measurements were made directly on total
anhydride, maleic anhydride, 1,4-naphthoquinone, and off-gases.
Phthalic anhydride was calculated by difference between total and
maleic anhydrides. Residual naphthalene in the product stream was
determined by overall difference. Thus, concentrations of naphthalene
(N), phthalic anhydride (P), maleic anhydride (M), naphthoquinone (Q),
and off-gases (G) constituted the dependent variables.

Since no measurement errors were given, equal statistical weights
were assigned to the observed values of concentration.* Data were
recorded for a given temperature at various reaction times ranging
from 0.020 to 2.44 sec. Temperatures were also explored from 340 to
475°C. Six observations were made on each chemical species at 340°C,
five at 375°C, six at 410°C, four at 450°C, and three at 475°C. In all,
24 observations were made on each chemical species.

*The availability of maximum likelihood computer programs would now
make such an assignment unnecessary.

Naphthalene  depletion  behavior  was  analyzed  by  examining  each 
set  of  data  first  at  constant  temperature  to  determine  the  specific 
reaction  rate  at  that  temperature;  second,  the  kinetics  were  examined 
for  temperature  dependency  and  for  the  order  of  the  reaction.  The 
typical  rate  equation  for  the  second  case  is  given  by 

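$$ \frac{d[N]}{dt} = -k_0 [N]^{\alpha} \exp[-E/RT] \qquad (40) $$

which when integrated yields, for $\alpha \ne 1$ and with
$k = k_0 \exp[-E/RT]$:

$$ \frac{[N_0]^{1-\alpha}}{1-\alpha} \left[ 1 - \left( \frac{[N]}{[N_0]} \right)^{1-\alpha} \right] = kt \qquad (41) $$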
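
Solving the integrated rate law for the remaining fraction of naphthalene gives a function that can be evaluated directly for any reaction order. The following Python sketch does this; the function name `remaining_fraction` is hypothetical, and the first-order branch is the standard exponential limit, which the integrated form itself does not cover.

```python
import math


def remaining_fraction(k, t, alpha, n0):
    """[N]/[N0] after time t for d[N]/dt = -k [N]^alpha, solved from the integrated rate law."""
    if alpha == 1.0:
        return math.exp(-k * t)          # first-order limit, outside the alpha != 1 formula
    # Rearranged: ([N]/[N0])^(1 - alpha) = 1 - (1 - alpha) * k * t / [N0]^(1 - alpha)
    base = 1.0 - (1.0 - alpha) * k * t / n0 ** (1.0 - alpha)
    # For alpha < 1 the reactant is exhausted in finite time; clamp at zero afterwards.
    return max(base, 0.0) ** (1.0 / (1.0 - alpha))
```

For example, with unit rate constant, unit initial concentration, and unit time, second-order kinetics leaves one half of the reactant and half-order kinetics leaves one quarter.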


Typical  results  are  shown  in  Table  1,  indicating  satisfactory 
agreement  between  the  nonlinear  estimation  analysis  and  that  car¬ 
ried  out  by  D’Alessandro  and  Farkas;  this  agreement  holds  for  each 
temperature  as  well  as  over  the  entire  temperature  range. 

TABLE 1

Constant-Temperature Kinetics

                    k, sec^-1
Temp.,     Nonlinear      D'Alessandro
°C         estimation     and Farkas

340          0.52           0.56
375          1.84           1.70
410          7.40           7.36
450         21.3           20.4
475         34.8           32.9
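
The temperature dependence of the rate constants in Table 1 can be examined with a simple Arrhenius plot. The Python sketch below fits a straight line to ln k against 1/T for the nonlinear-estimation column. This linearized fit is a rough cross-check only and is not the nonlinear estimation procedure used in the study; the conversion to kelvin by adding 273.15 and the value R = 1.987 cal/(mole °K) are standard constants supplied for the sketch.

```python
import math

# Table 1, nonlinear-estimation column: (temperature in deg C, k in 1/sec).
table1 = [(340, 0.52), (375, 1.84), (410, 7.40), (450, 21.3), (475, 34.8)]


def arrhenius_fit(data):
    """Straight-line least-squares fit of ln k against 1/T; returns (ln k0, E/R in kelvin)."""
    xs = [1.0 / (t_c + 273.15) for t_c, _ in data]
    ys = [math.log(k) for _, k in data]
    n = len(data)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, -slope


ln_k0, e_over_r = arrhenius_fit(table1)
e_kcal_per_mole = e_over_r * 1.987e-3    # R = 1.987 cal/(mole K), converted to kcal
```

Because the fit is made on ln k with equal weights, it weights the data differently from a least-squares fit on the concentrations themselves, so its result should not be expected to reproduce the tabulated nonlinear estimates exactly.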

The second phase of this study, dealing with an analysis of the
overall reaction mechanism, was next initiated. The mechanism
considered most plausible by D'Alessandro and Farkas was given by


To analyze this and possibly other mechanisms, nonlinear estimation
was applied to the experimental data by proceeding from simple
mechanisms to increasingly more complex ones. The purpose of this
cautious approach was to treat mechanisms with few parameters at the
outset which could then be augmented by additional parameters as the
analysis continued. In this way control could be exercised to avoid
difficulties in convergence of the nonlinear estimation technique for
those parameter estimates which were poorly determined. The complete
analysis is shown in Table 2, where the circled species indicate that
measurements on these species were taken into account over the whole
temperature range. In Table 2, $k_0'$, defined as
$k_0' = k_0 \exp(-E/RT)$, was evaluated with
$1/T = 1.42 \times 10^{-3}$ °K$^{-1}$. Furthermore, since values of [N]
were derived by differences, these values are not independent of other
measured concentrations. As a result "observations" on [N] were
excluded. In all, 96 observations were used resulting from the 24
observations made on each of the four chemical species.

Several  conclusions  can  be  drawn  from  these  tabulated  results. 
First  and  foremost  is  that  a  slightly  abbreviated  form  of  the  pro¬ 
posed  mechanism  is  an  adequate  representation  of  the  data: 


The overall fit as measured by the standard deviation of residuals
s is satisfactory when compared to values of s for other mechanisms.
Also, the least-squares parameter estimates $\theta$ obtained appear
fairly well determined as indicated by the values of the associated
standard deviations of the parameter estimates $s_\theta$.

These  results  were  confirmed  further  by  considering  a  number  of 
alternative  mechanisms  including  the  originally  proposed  one  and  one 
suggested  by  Mars  and  van  Krevelen  [56]  in  which  the  rate-determin¬ 
ing  step  is  the  chemisorption  of  oxygen.  Without  presenting  the  ex¬ 
plicit  data  here,  we  merely  state  that  these  additional  calculations 
showed that the mechanism proposed by D'Alessandro and Farkas
was  an  adequate  representation  of  the  experimental  data;  further¬ 
more,  the  reaction  rates  and  activation  energies  for  the  primary 
branching  of  naphthalene  to  phthalic  anhydride  and  naphthoquinone 
were  not  equal,  particularly  at  the  higher  temperatures,  in  contrast 
to  the  values  determined  by  D’Alessandro  and  Farkas  which  were 
equal  over  the  temperature  range.  However,  it  was  also  shown  that 
an equally acceptable model is that proposed by Ioffe and Sherman
[40] in which naphthalene undergoes direct and simultaneous oxidation
to phthalic anhydride and naphthoquinone, but not to evolved
gases, i.e., involving a binary branching


TABLE 2

Mechanism Evaluation

For each mechanism the first line gives the estimates and the second
line their standard deviations. Values are listed left to right by
column number (1-6); empty columns are not shown.

k0' = k0 exp(-E/RT)                     E' = E/R                                α       s

8.79  0.842                             14.5  12.4                                      0.0326
0.26  0.163                             0.4   2.8

9.23  0.873  0.833                      14.6  12.7  9.68                                0.0278
0.26  0.146  0.151                      0.4   2.4   2.34

8.10  2.16  0.882  16.8                 15.0  11.9  9.80  7.41                          0.0259
0.61  0.59  0.142  7.8                  1.0   3.2   2.09  5.60

8.10  3.00  0.881  28.0  0.831          15.6  13.0  8.86  9.69  16.8                    0.0217
0.73  0.74  0.124  8.9   0.221          1.1   2.6   3.96  1.80  3.5

8.26  3.55  0.926  31.3  0.750          15.9  13.5  9.93  17.6  9.05            1.13    0.0218
0.88  1.11  0.142  10.4  0.236          1.3   2.8   1.85  3.9   3.8             0.15

8.09  2.76  1.20  29.2  0.861  -4.15    15.6  12.8  11.1  16.6  8.84  10.9              0.0219
0.72  0.74  0.48  9.0   0.227  6.10     1.1   2.7   3.2   3.6   3.90  12.1


Ioffe and Sherman similarly studied the vapor-phase oxidation of
naphthalene in the presence of vanadium oxide catalyst. For fixed
initial concentrations of naphthalene and air of $2.69 \times 10^{-4}$
and $8.75 \times 10^{-3}$ moles/liter, respectively, extensive integral
conversion data were obtained in a flow reactor under constant volume
and temperature conditions over a temperature range of 260-400°C and
reaction times of 0.047-0.499 sec.

Measured were conversions to phthalic anhydride (P),
1,4-naphthoquinone (Q), maleic anhydride (M), and off-gases (G).
Residual naphthalene (N) was obtained by difference. For each chemical
species, four to seven observations were recorded at each of nine
temperatures. In all, 55 measurements were obtained for each of the
chemical species. However, maleic anhydride did not appear in
measurable quantities except at temperatures of 380 and 400°C. Because
of omissions in the data of Ioffe and Sherman, some adjustments were
necessary in this study at the higher temperatures.

Constant-temperature  and  then  temperature-dependent  effects 
were  analyzed  by  nonlinear  estimation  with  the  order  of  the  reaction 
both  prescribed  as  first  order  and  left  unassigned  to  be  estimated 
from  the  data.  The  results  obtained  were  in  substantial  agreement 
with  the  proposals  of  Ioffe  and  Sherman,  the  kinetics  being  plausibly 
first  order. 

Proceeding  further,  Ioffe  and  Sherman  suggested  that  the  activa¬ 
tion  energy  is  not  constant  with  temperature,  but  decreases  from  an 
initial  value  of  27.4  kcal/mole  as  the  temperature  rises  and  that 
while  first-order  kinetics  dominated  at  low  temperatures,  internal 
pore diffusion became increasingly important until at about 350°C
half-order  kinetics  prevailed.  To  test  this  assertion,  data  from  prox¬ 
imate  temperature  intervals  were  examined  by  nonlinear  estimation 
for  both  first-order  and  unassigned-order  kinetics.  The  results  are 
shown  in  Table  3. 


TABLE 3

Temperature-Dependent Kinetics,
Proximate Temperature Intervals

              First-order
              kinetics,        Unassigned-order kinetics
Temp.         E,
range, °C     kcal/mole       E       s_E     α       s_α     s

260-270         28.2         26.6     3.6     0.77    0.24    0.023
270-290         21.6         19.7     1.8     0.59    0.12    0.030
290-310         11.2         10.9     2.8     0.91    0.13    0.046
310-330         17.4         17.4     2.7     1.15    0.12    0.039
330-350         14.0         14.0     3.2     1.06    0.16    0.042
350-360         13.2         12.7     4.9     0.85    0.12    0.029
360-380         12.8         11.4     3.7     0.76    0.15    0.040
380-400         -1.9         -0.8     1.1     0.32    0.16    0.019

Estimates of the activation energy confirm the systematic trend
with temperature. The results also show that a significant change
occurs in the vicinity of 400°C. Pinchbeck [60], using a similar
catalyst in a fluidized bed, observed this effect at about the same
temperature and attributed the departure to a modification of the
mechanism of reaction. Usbakova et al. [68] demonstrated that the
mechanism is dependent on the valence state of vanadium and that the
state of the oxide is not uniform throughout the catalyst. Tandy [67],
in an investigation of catalysts for the oxidation of sulfur dioxide to
trioxide, reported an approximate melting point of 400°C for mixtures
of vanadium oxide-potassium sulfate. The behavior at this elevated
temperature is therefore presumed to be related to a marked alteration
in the physical character of the catalyst.

Except  at  the  highest  temperature,  estimates  of  the  order  of  re¬ 
action  do  not  confirm  the  half-order  kinetics  established  by  Ioffe 
and  Sherman,  but  rather  support  largely  first-order  kinetics  at  both 
low  and  high  temperatures.  However,  the  activation  energy  decreases 
rapidly  with  temperature  to  about  one-half  the  initial  value.  As 
noted  by  Wheeler  [70],  this  behavior  is  consistent  with  diffusion  oc¬ 
curring  in  the  pores  of  the  catalyst.  Because  internal  pore  diffusion 
is  present,  systematic  deviations  are  evident  at  the  lowest  tempera¬ 
ture,  but  the  anomaly  at  the  highest  temperature  is  obscured  by  the 
comparatively  few  observations  made. 

As  before,  the  next  step  in  the  analysis  was  consideration  of  the 
overall  reaction  mechanism.  Nonlinear  estimation  analysis  was 
carried  out  as  previously,  starting  with  simpler  mechanisms  and 
terminating  with  the  most  complex  mechanism  the  data  could  sup¬ 
port.  Also,  as  in  the  previous  analysis,  “observations”  on  [N]  were 
excluded, yielding a total number of 220 observations, with 55
measurements made on each of four species. $k_0'$ was evaluated at
$1/T = 1.67 \times 10^{-3}$ °K$^{-1}$.

A complete tabulation of results is given in Table 4, where the
$s_\theta$ are omitted for all but the terminal mechanisms. The temperature
range  was  initially  largely  restricted  to  260-350°C  until  maleic  an¬ 
hydride  was  introduced  as  a  chemical  species,  at  which  point  the 
temperature  range  was  extended  to  260-400°C.  These  results  indi¬ 
cate  that  the  mechanism  proposed  by  Ioffe  and  Sherman  [Eq.  (43)]  is 
plausible.  As  shown  previously  there  is  the  possibility  that  other 
mechanisms  such  as  Eq.  (42)  will  also  fit  the  data.  The  results  for 
the  two  mechanisms,  while  not  shown  here,  confirm  that  each  fits 
the  data  equally  well. 

Using  these  results  and  others  obtained  by  Peterson,  the  following 
conclusions  can  be  drawn  from  the  nonlinear  estimation  analysis: 

(1)  Internal  pore  diffusion  is  a  plausible  explanation  of  the  variation 
of  activation  energy  with  temperature,  although  contrary  to  the  anal¬ 
ysis  of  Ioffe  and  Sherman,  first-order  kinetics  largely  dominate 
throughout  the  temperature  range.  (2)  The  mechanism  proposed  is 
an  adequate  representation  of  the  data,  but  equally  acceptable  is  that 
proposed  by  D’Alessandro  and  Farkas  in  which  naphthak  ve  also 
undergoes  simultaneous  oxidation  directly  to  evolv’d  gases.  (3)  The 
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TABLE  4 

Temp. 

range, 

Mechanism 

•c 

l 

2 

3 

4 

260-400 

8 

3.84 

0.150 

N  — “©) 

260-400 

§ 

2.96 

8.65 

N 

260-350 

8 

5.08 

1.47 

1  J§> 

N  1 5 

260-350 

8 

4.40 

3.37 

3.44 

1 

N  t5 

260-350 

8 

4.32 

3.79 

0.73 

N  ]5 

260-350 

8 

4.37 

3.86 

0.94 

260-400 

8 

4.03 

0.014 

3.15 

0.98 

s  8 

0.19 

0.013 

0.21 

0.18 

N<*'^t5 

260-400 

8 

8e 

4.01 

0.27 

0.012 

0.012 

3.02 

0.30 

1.02 

0.19 

fit  of  the  data  for  1,4-naphthoquinone  is  suspect  at  temperatures  in 
the  vicinity  of  350°C  and  suggests  the  presence  of  a  closely  related 
reaction  product,  1,2-naphthoquinone,  undetermined  by  Ioffe  and 
Sherman,  but  later  reported  by  Shelstad  [66]. 

The  above  has  been  concerned  with  homogeneous  kinetic  studies; 
by  contrast,  Lapidus  and  Peterson  [52,59]  considered  the  heterogen¬ 
eous  case  in  which  Langmuir-Hinshelwood  relations  were  used  to 
describe the rate-controlling mechanism. Two studies were under-
Mechanism Evaluation


E' 

O' 

8 

5 

1 

2 

3 

4 

5 

7.27 

10.7 

0.062 

6.26 

6  62 

0.045 

8.81 

4,98 

0  067 

9.78 

7.72 

3.21 

0.201 

3.52 

9.92 

7.98 

8.54 

2.80 

0.042 

3.45 

9.83 

7.83 

4.18 

3.02 

0.036 

3.51 

8.41 

22.7 

6.34 

5.79 

1.01 

0.042 

0.45 

0.40 

5.3 

0.58 

1.62 

1.36 

3.28 

8.14 

23.4 

6.45 

5.41 

4.69 

0.96 

0  042 

0.48 

0.47 

5.6 

0.60 

1.61 

1.40 

0.11 

taken,  both  relating  to  ethanol.  In  the  first  study  the  experimental 
data of Kabel [41] on ethanol dehydration were analyzed; in the second
study the data of Franckaerts and Froment [26] on the dehydrogenation
of ethanol were analyzed. In both cases extensive integral
conversion  data  were  available. 

In  the  first  study  four  different  models  were  used  to  represent 
the  experimental  data.  Those  included  three  heterogeneous  models 
and  one  homogeneous  model.  In  the  nonlinear  estimation  analysis, 
a preliminary screening of various mechanisms was made against the data
on the basis of obtaining random residuals and a suitable residual
error variance. Several subsets of the original data were identified
for further study. A homogeneous-type rate mechanism was found to
be as adequate in describing the data as the heterogeneous mechanisms.
Using only heterogeneous mechanisms, it proved impossible
to  conclude  that  any  one  mechanism  was  more  suitable  than  another 
unless  external  information  was  also  used.  This  conclusion,  while 
not  unknown  to  researchers  in  kinetics  and  mechanism  studies,  may 
seem  rather  surprising  since  a  large  amount  of  data  was  available 
for  estimating  relatively  few  parameters. 

It  was  thus  concluded  that  no  discrimination  among  mechanisms 
was  possible,  presumably  as  a  result  of  either  or  both  of  these  ef¬ 
fects:  (1)  The  experimental  data  examined  were  constrained  to  be  at 
only  one  constant  pressure.  (2)  An  accidental  arrangement  of  the 
physical  parameters  led  to  a  degeneracy  in  the  rate  equations  for 
various heterogeneous mechanisms, which therefore became equivalent
to the homogeneous model.

In  the  second  study,  nonlinear  estimation  led  to  substantially  the 
same  results  as  the  initial  rate  method  combined  with  linear  estima¬ 
tion.  Since  fewer  assumptions  and  transformations  were  involved  in 
the  analysis  by  nonlinear  estimation,  the  results  are  possibly  more 
valid. 

It  has  also  been  shown  that  data  at  different  pressures  as  well  as 
different  feed  compositions  measurably  improve  the  ability  to  dis¬ 
criminate  among  mechanisms  and  provide  suitable  estimates  for 
parameters. Constant-pressure data alone do not appear to be adequate
for the more definitive study of kinetics and mechanism.

In  a  most  interesting  and  important  series  of  papers  on  kinetics 
modeling,  Kittrell  et  al.  [42-48]  first  analyzed  the  isothermal  and 
nonisothermal  heterogeneous  reduction  of  nitric  oxide  via  Langmuir- 
Hinshelwood  models.  Three  different  models  were  postulated  and 
then  approximated  to  the  experimental  data  by  linear  and  by  nonlinear 
least  squares.  Comparison  of  the  two  approaches  showed  that  non¬ 
linear  least  squares  was  more  useful  for  a  rational  selection  of  an 
acceptable  single  model  and  estimation  of  its  parameters.  This  was 
particularly  true  if  the  rate  equations  determined  by  each  procedure 
were  extrapolated  beyond  the  actual  regions  of  the  experimental  data; 
here  the  two  approaches  yielded  rate  curves  which  differed  quite 
widely.  Also,  these  authors  indicated  explicitly  the  need  for  further 
data  to  reduce  the  confidence  regions  for  the  estimated  parameters 
and  make  the  analysis  even  more  efficient. 

In a summary paper [49] these same authors surveyed model
building techniques in general with a primary emphasis on mechanistic
models. This work represents a convenient outline of problems
involved in model representation of kinetic systems, the methods of
solution, and the results obtained.

VI.  DESIGN  OF  EXPERIMENTS  FOR  MODEL  DISCRIMINATION 

It  was  stated  above  that  our  efforts  to  estimate  parameters  for 
a  model  may  be  defeated  by  data  obtained  from  poorly  designed  ex¬ 
periments.  The  same  statement  applies  with  even  greater  force  to 
the  problem  of  selecting  the  best  model  from  among  a  set  of  candi¬ 
dates. 

As  was  the  case  with  parameter  estimation,  it  is  the  experimen¬ 
ter’s  task  at  each  stage  to  seek  out  that  experiment  which  is  likely 
to  yield  the  greatest  amount  of  relevant  information.  For  the  pur¬ 
pose  of  discriminating  between  two  proposed  models,  a  measure  of 
the  relevant  information  is  (see  Kullback  [50]) 

$$
I = E^{(1)} \log \frac{p^{(1)}}{p^{(2)}} + E^{(2)} \log \frac{p^{(2)}}{p^{(1)}} \tag{44}
$$

where $E^{(i)}$ denotes expectation under the assumption that model i is
true, and $p^{(i)}$ is the probability density function under the same
assumption (i = 1, 2).

Suppose we have performed N-1 experiments and have fitted the
data to both models 1 and 2, yielding parameter sets $\theta^{(1)}$ and
$\theta^{(2)}$. Let $x_N$ represent the values of the independent variables
for a proposed Nth experiment. We can compute the predicted values
$y_N^{(1)}$ and $y_N^{(2)}$ of the observed variables for the proposed
experiment under hypotheses 1 and 2, respectively:

$$
y_N^{(1)} = f^{(1)}[x_N, \theta^{(1)}]
$$

$$
y_N^{(2)} = f^{(2)}[x_N, \theta^{(2)}]
$$

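Let the covariance matrices of these two predictions be $V^{(1)}$ and
$V^{(2)}$, respectively, that is,

$$
E^{(i)}\,[y_N - y_N^{(i)}]\,[y_N - y_N^{(i)}]^T = V^{(i)} \qquad (i = 1, 2)
$$

Furthermore, let $U^{(i)} = [V^{(i)}]^{-1}$. Assuming normal distributions, it
is easily shown that Eq. (44) reduces to

$$
I = -r + \tfrac{1}{2}\,\mathrm{Tr}[U^{(2)} V^{(1)}] + \tfrac{1}{2}\,\mathrm{Tr}[U^{(1)} V^{(2)}]
  + \tfrac{1}{2}\,[y_N^{(1)} - y_N^{(2)}]^T\,[U^{(1)} + U^{(2)}]\,[y_N^{(1)} - y_N^{(2)}] \tag{45}
$$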
where

$$
\mathrm{Tr}(UV) = \sum_{i,j} U_{ij} V_{ij}
$$
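As a minimal sketch of how Eq. (45) might be evaluated numerically, the following Python function computes the information measure from the two predicted vectors and their covariance matrices. The function names, the small pure-Python matrix inverse, and the reading of r as the number of observed variables are assumptions made for illustration; none of them comes from the original text.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def inverse(a: Sequence[Sequence[float]]) -> Matrix:
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [vk - factor * vc for vk, vc in zip(aug[k], aug[col])]
    return [row[n:] for row in aug]


def trace_of_product(u: Matrix, v: Matrix) -> float:
    """Tr(UV); for symmetric U and V this equals the sum of U_ij * V_ij."""
    n = len(u)
    return sum(u[i][j] * v[j][i] for i in range(n) for j in range(n))


def information(y1: Sequence[float], y2: Sequence[float],
                v1: Matrix, v2: Matrix) -> float:
    """Eq. (45): information for discriminating two normal predictions.

    y1, y2: predicted values under models 1 and 2.
    v1, v2: covariance matrices of those predictions.
    r is taken to be the number of observed variables (an assumption).
    """
    r = len(y1)
    u1, u2 = inverse(v1), inverse(v2)
    diff = [a - b for a, b in zip(y1, y2)]
    u_sum = [[u1[i][j] + u2[i][j] for j in range(r)] for i in range(r)]
    quad = sum(diff[i] * u_sum[i][j] * diff[j]
               for i in range(r) for j in range(r))
    return (-r + 0.5 * trace_of_product(u2, v1)
            + 0.5 * trace_of_product(u1, v2) + 0.5 * quad)
```

When the two models give identical predictions with identical covariances, the function returns zero (up to rounding), which matches the intuition that such an experiment carries no discriminating information.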

Application of Eq. (45) requires estimation of the matrices $V^{(i)}$,
which are measures of the uncertainties in the predicted values $y_N^{(i)}$.
There are two sources for these uncertainties:

1. The uncertainty in the values of the estimated parameters $\theta^{(i)}$,
measured by means of the covariance matrix of the estimates $W^{(i)}$.

2. The difference between the measured values of the variables
$y_N^{(i)}$ and their true values, due to experimental errors. A measure of
these errors is given by the covariance matrix of the residuals $R^{(i)}$.

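The matrices $W^{(i)}$ and $R^{(i)}$ are obtained as by-products of estimating
the parameters for the ith model, based on the experiments
already conducted. Now, assuming that these two causes bring about
independent errors, we have approximately

$$
V^{(i)} = R^{(i)} + \frac{\partial f^{(i)}}{\partial \theta^{(i)}}\, W^{(i)} \left[ \frac{\partial f^{(i)}}{\partial \theta^{(i)}} \right]^T \qquad (i = 1, 2) \tag{46}
$$

or, written out in full,

$$
V_{ab}^{(i)} = R_{ab}^{(i)} + \sum_{p,q} \frac{\partial f_a^{(i)}}{\partial \theta_p^{(i)}}\, W_{pq}^{(i)}\, \frac{\partial f_b^{(i)}}{\partial \theta_q^{(i)}}
$$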


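An interesting case is that of a single observed variable, where
$V^{(i)}$ and $R^{(i)}$ are single numbers, $\sigma_i^2$ and $s_i^2$, respectively, with

$$
\sigma_i^2 = s_i^2 + \sum_{p,q} W_{pq}^{(i)}\, \frac{\partial f^{(i)}}{\partial \theta_p^{(i)}}\, \frac{\partial f^{(i)}}{\partial \theta_q^{(i)}}
$$

and Eq. (45) becomes

$$
I = \frac{(\sigma_1^2 - \sigma_2^2)^2}{2 \sigma_1^2 \sigma_2^2}
  + \tfrac{1}{2}\,[y_N^{(1)} - y_N^{(2)}]^2 \left( \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} \right) \tag{47}
$$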


If $\sigma_1$ and $\sigma_2$ do not vary much from one set of experimental conditions
to another, the information is essentially proportional to
$[y_N^{(1)} - y_N^{(2)}]^2$, i.e., to the square of the difference between the
predicted values under the two hypotheses. The experiment to be
performed is the one for which this difference is the greatest, a
common-sense result. More generally we seek those feasible values of
the independent variables for which the information function
[Eqs. (44), (45), or (47)] attains its maximum.
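The single-variable case can be sketched in a few lines of Python. The sketch below assumes that the information reduces, for one observed variable, to the scalar form obtained by setting r = 1 in Eq. (45); the forward-difference derivatives, the function names, and the toy models in the usage lines are all hypothetical choices for illustration rather than anything prescribed by the text.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float], Sequence[float]], float]


def gradient(f: Model, x: Sequence[float], theta: Sequence[float],
             rel_step: float = 1e-6) -> List[float]:
    """Forward-difference derivatives of f with respect to each parameter."""
    base = f(x, theta)
    grad = []
    for p, value in enumerate(theta):
        h = rel_step * max(abs(value), 1.0)
        shifted = list(theta)
        shifted[p] = value + h
        grad.append((f(x, shifted) - base) / h)
    return grad


def prediction_variance(f: Model, x: Sequence[float], theta: Sequence[float],
                        w: Sequence[Sequence[float]], s2: float) -> float:
    """sigma^2 = s^2 + sum_pq W_pq (df/dtheta_p)(df/dtheta_q)."""
    g = gradient(f, x, theta)
    m = len(g)
    return s2 + sum(w[p][q] * g[p] * g[q] for p in range(m) for q in range(m))


def information_single(y1: float, y2: float, var1: float, var2: float) -> float:
    """Scalar form of Eq. (45) with r = 1, U = 1/variance."""
    return ((var1 - var2) ** 2 / (2.0 * var1 * var2)
            + 0.5 * (y1 - y2) ** 2 * (1.0 / var1 + 1.0 / var2))


def best_experiment(candidates, f1, theta1, w1, s2_1, f2, theta2, w2, s2_2):
    """Return the candidate conditions x with the largest information."""
    def score(x):
        return information_single(
            f1(x, theta1), f2(x, theta2),
            prediction_variance(f1, x, theta1, w1, s2_1),
            prediction_variance(f2, x, theta2, w2, s2_2))
    return max(candidates, key=score)


# Usage with two hypothetical one-parameter models.
def linear(x, theta):
    return theta[0] * x[0]


def quadratic(x, theta):
    return theta[0] * x[0] ** 2


grid = [[0.5], [1.0], [2.0]]
chosen = best_experiment(grid, linear, [1.0], [[0.01]], 0.04,
                         quadratic, [1.0], [[0.01]], 0.04)
print("next experiment at x =", chosen)
```

Searching a finite list of candidate conditions is only one way to locate the maximum; a continuous optimizer over the feasible region would serve the same purpose.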

Equation  (47)  has  been  derived  by  Box  and  Hill  [13],  who  have 
been  the  pioneers  in  the  application  of  sequential  design  to  kinetics 
experimentation.  Hill  and  Hunter  also  provide  generalizations  for 
discrimination  among  more  than  two  models  and  for  combining  the 
parameter  estimation  and  model  discrimination  criteria  in  a  single 
function  [35].  The  idea  here  is  to  form  a  linear  combination  of  the 
two  criteria,  which  is  initially  weighted  to  favor  model  discrimina¬ 
tion.  As  one  model  becomes  increasingly  favored,  the  weight  is 
shifted  toward  estimating  the  parameters  for  that  model. 

A.  Termination  of  a  Sequence  of  Experiments 

The maximum information principle permits us to choose the experimental
conditions for the Nth experiment after N-1 experiments
have  been  performed.  We  need  a  criterion  to  decide  whether  the  Nth 
experiment  should  be  performed  at  all,  or  whether  we  may  already 
prefer  one  of  the  models  with  a  sufficient  degree  of  certainty.  Such 
a criterion is provided by Wald’s likelihood ratio test [69]: Let $L^{(i)}$
be the maximum value of the likelihood function based on model i
using the data obtained to date. If $\rho = L^{(1)}/L^{(2)}$, then we adopt the
following procedure:

If $\rho \ge \alpha/(1-\alpha)$, we accept model 1 with confidence $\alpha$.

If $\rho \le (1-\alpha)/\alpha$, we accept model 2 with confidence $\alpha$.

If $(1-\alpha)/\alpha < \rho < \alpha/(1-\alpha)$, we continue experimentation.

Note: for a confidence level of 99%, we set $\alpha$ = 0.99.

In  the  case  of  a  single  observed  variable  with  a  normal  distribu¬ 
tion,  we  have,  after  N  experiments, 

$$
\rho = \left( \frac{\sigma_2}{\sigma_1} \right)^{N} \exp\left[ \tfrac{1}{2}\,(p_1 - p_2) \right]
$$

where $\sigma_i$ is the standard deviation of the residuals and $p_i$ the number
of parameters in model i.
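The stopping rule can be illustrated with a short Python sketch. It assumes the thresholds α/(1-α) and (1-α)/α and the single-variable expression for the likelihood ratio, ρ = (σ₂/σ₁)^N exp[(p₁ - p₂)/2]; the function names and the returned strings are hypothetical.

```python
import math


def likelihood_ratio(sigma1: float, sigma2: float, n: int,
                     p1: int, p2: int) -> float:
    """rho = L1/L2 for a single normally distributed observed variable.

    sigma1, sigma2: residual standard deviations of models 1 and 2.
    n: number of experiments performed so far.
    p1, p2: number of parameters in models 1 and 2.
    """
    return (sigma2 / sigma1) ** n * math.exp(0.5 * (p1 - p2))


def wald_decision(rho: float, alpha: float) -> str:
    """Apply the likelihood ratio test at confidence level alpha."""
    if not 0.5 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0.5 and 1")
    upper = alpha / (1.0 - alpha)
    lower = (1.0 - alpha) / alpha
    if rho >= upper:
        return "accept model 1"
    if rho <= lower:
        return "accept model 2"
    return "continue experimentation"


# Usage: model 2 fits noticeably better after 20 runs (hypothetical numbers).
rho = likelihood_ratio(sigma1=0.12, sigma2=0.08, n=20, p1=4, p2=4)
print(wald_decision(rho, alpha=0.99))
```

For long sequences the power (σ₂/σ₁)^N can become very large or very small, so a practical implementation might prefer to compare log ρ against the logarithms of the thresholds.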

B.  Numerical  Example 

There  do  not  seem  to  be,  as  yet,  any  published  results  of  kinetics 
experiments  actually  carried  out  according  to  the  above-detailed 
prescriptions. Tests on computer-simulated experiments, however,
show that use of the proper experimental design method can increase
spectacularly our ability to select a proper model. In a computer-simulated
experiment, a computer subroutine replaces the laboratory
equipment. The selected experimental conditions (values of the
independent variables) are fed as input to the subroutine, which
computes the hypothetical results of the experiment, using one of the
alternative models. The subroutine then adds to these results a
pseudo-random variable having prescribed statistical properties (e.g.,
variance). These results are accepted by the parameter estimation and
experimental design programs as though they had been produced by
an actual experiment.
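A simulated experiment of this kind might be sketched as follows. The sketch assumes additive, normally distributed error with a prescribed standard deviation, which is only one possible choice of "prescribed statistical properties"; the name `make_simulator`, the seeding scheme, and the first-order toy model are hypothetical.

```python
import random
from typing import Callable, Optional, Sequence

Model = Callable[[Sequence[float], Sequence[float]], float]


def make_simulator(model: Model, theta: Sequence[float], error_std: float,
                   seed: Optional[int] = None) -> Callable[[Sequence[float]], float]:
    """Return a function that plays the role of the laboratory equipment.

    model:     rate expression assumed to be the "true" mechanism.
    theta:     parameter values used to generate the hypothetical results.
    error_std: standard deviation of the added pseudo-random error.
    seed:      optional seed so that a simulated run can be repeated.
    """
    rng = random.Random(seed)

    def run_experiment(x: Sequence[float]) -> float:
        # Hypothetical result of the experiment plus simulated error.
        return model(x, theta) + rng.gauss(0.0, error_std)

    return run_experiment


# Usage with a hypothetical first-order rate law y = k * x.
def first_order(x, theta):
    return theta[0] * x[0]


simulate = make_simulator(first_order, theta=[2.0], error_std=0.05, seed=1)
observations = [simulate([x]) for x in (0.5, 1.0, 1.5)]
print(observations)
```

The estimation and design routines would then call `simulate` exactly where they would otherwise read a laboratory measurement.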

A  system  selected  for  simulation  was  that  of  the  catalytic  hydro¬ 
genation  of  mixed  isooctenes,  described  by  Hougen  and  Watson  [37], 
and  further  analyzed  by  Blakemore  and  Hoerl  [5].  The  latter  authors 
reduced  the  number  of  acceptable  rate  equations  to  two: 

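$$
\text{Model 1:} \quad y^{(1)} = \frac{\theta_1^{(1)} x_1 x_2}{\left(1 + \theta_2^{(1)} x_1 + \theta_3^{(1)} x_2 + \theta_4^{(1)} x_3\right)^3}
$$

$$
\text{Model 2:} \quad y^{(2)} = \frac{\theta_1^{(2)} x_1 x_2}{\left(1 + \theta_2^{(2)} x_1 + \theta_3^{(2)} x_2 + \theta_4^{(2)} x_3\right)^2}
$$

where y is the rate of the reaction, and $x_1$, $x_2$, and $x_3$ are, respectively,
the partial pressures of hydrogen, isooctene, and isooctane.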
The  available  experimental  data  were  insufficient  for  preferring  one 
of  these  models.  For  the  computer-simulated  experiments,  the  par¬ 
tial  pressures  of  the  reactants  were  confined  to  the  same  region  as 
in  the  data  used  by  Blakemore  and  Hoerl,  namely, 


$0.1 \le x_1 \le 2.5$
$0.1 \le x_2 \le 3.0$
$0.05 \le x_3 \le 2.7$
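For readers who wish to experiment with these two rate equations, a minimal Python sketch is given here. It assumes the two models differ only in the exponent of the denominator (3 for model 1, 2 for model 2); the function names and the parameter values in the usage lines are hypothetical, since no estimates are quoted in the text.

```python
from typing import Sequence

# Feasible region for (x1, x2, x3): partial pressures of hydrogen,
# isooctene, and isooctane, as quoted in the text.
BOUNDS = [(0.1, 2.5), (0.1, 3.0), (0.05, 2.7)]


def rate(x: Sequence[float], theta: Sequence[float], exponent: int) -> float:
    """theta1*x1*x2 / (1 + theta2*x1 + theta3*x2 + theta4*x3)**exponent."""
    x1, x2, x3 = x
    t1, t2, t3, t4 = theta
    return t1 * x1 * x2 / (1.0 + t2 * x1 + t3 * x2 + t4 * x3) ** exponent


def model_1(x: Sequence[float], theta: Sequence[float]) -> float:
    """Model 1: denominator raised to the third power."""
    return rate(x, theta, 3)


def model_2(x: Sequence[float], theta: Sequence[float]) -> float:
    """Model 2: denominator raised to the second power."""
    return rate(x, theta, 2)


def in_region(x: Sequence[float]) -> bool:
    """True if the proposed conditions lie inside the feasible region."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(x, BOUNDS))


# Usage with illustrative (not fitted) parameter values.
x = (1.3, 1.55, 1.375)
theta = (1.0, 1.0, 1.0, 1.0)
print(in_region(x), model_1(x, theta), model_2(x, theta))
```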


For the first six experiments, the fractional design given in Table
5 was used and the results (i.e., the rate of the reaction y) were
computed assuming model 1 was correct. Subsequent experiments were
chosen so as to maximize the information measure given by Eq. (47)
[in this case, at least, precisely the same results are obtained by
maximizing $(y_N^{(1)} - y_N^{(2)})^2$]. The results of those experiments were
computed assuming model 2 was correct; the aim was to find out how
soon the sequence of experiments would overcome the “misleading”
results of the first six experiments and choose model 2 as the
correct one. A total of 27 experiments were “performed.” Also, 27
experiments constituting a 3 x 3 x 3 factorial design were carried
out. Table 6 compares the levels of confidence in model 2 achieved
after 27 experiments at various levels of simulated experimental
error.

TABLE 5

Experiment     x1       x2       x3

    1          0.1      1.55     1.375
    2          2.5      1.55     1.375
    3          1.3      0.1      1.375
    4          1.3      3.0      1.375
    5          1.3      1.55     0.05
    6          1.3      1.55     2.7
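The two fixed designs mentioned above can be generated with a short Python sketch. The six starting runs reproduce Table 5 (each variable set to its lower and upper limit in turn, the others held at the midpoint). For the 3 x 3 x 3 factorial, the choice of lower limit, midpoint, and upper limit as the three levels is an assumption, since the levels actually used are not stated; the function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Limits on x1, x2, x3 quoted in the text.
BOUNDS = [(0.1, 2.5), (0.1, 3.0), (0.05, 2.7)]


def starting_design(bounds: Sequence[Tuple[float, float]]) -> List[Tuple[float, ...]]:
    """One variable at a time at its limits, the rest at their midpoints."""
    center = [(lo + hi) / 2.0 for lo, hi in bounds]
    runs = []
    for k, (lo, hi) in enumerate(bounds):
        for value in (lo, hi):
            run = list(center)
            run[k] = value
            runs.append(tuple(run))
    return runs


def three_level_factorial(bounds: Sequence[Tuple[float, float]]) -> List[Tuple[float, ...]]:
    """All combinations of (low, mid, high) levels; 27 runs for 3 variables."""
    levels = [(lo, (lo + hi) / 2.0, hi) for lo, hi in bounds]
    return list(product(*levels))


for number, run in enumerate(starting_design(BOUNDS), start=1):
    print(number, run)
print(len(three_level_factorial(BOUNDS)), "factorial runs")
```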

As  expected,  the  discrimination  power  of  each  procedure  dim¬ 
inishes  with  increasing  experimental  error;  the  factorial  designs 
fail  completely  to  discriminate  at  3%  error,  whereas  the  sequential 
design  fails  only  at  10%.  Below  this  level  the  performance  of  the  se¬ 
quential  design  is  spectacularly  better  than  that  of  the  factorial. 

It is interesting to list the number of experiments that were
needed to reach a 90% confidence level by the sequential design.
These were 15, 17, and 30 for experimental errors of 1, 3, and 6%,
respectively. In contrast, the factorial design never exceeded a 60%
level.

TABLE 6

                     Confidence in model 2
                     after 27 experiments, %

Experimental         Factorial       Sequential
error, %             design          design

     1                 60              99.8
     3                 50              98
     6                 50              80
    10                 50              50

The above results show that sequential experimental design holds
great promise. Whether this promise will be fulfilled in practice is
not yet known. The main theoretical question raised by applying these
methods in practice stems from the fact that rarely is any of the
proposed mechanisms exactly correct, and therefore one should not
exaggerate the confidence placed on any model selected as a result of
analyzing the data obtained in a series of experiments. Clearly, if
the correct model is not among those proposed, it cannot be the one
selected. Frequently, however, the data themselves may lead to
systematic modifications of the model, until a suitable model is
found. Such an analysis is described in detail by Hunter and
co-workers [38,44].


APPENDIX 

A number of computer programs are currently available which
include some or all of the features discussed in this paper. These
programs and their features are briefly described below.

1.  The  Lapidus-Peterson  program  suitable  for  the  IBM  7090/94 
computers  [51].  In  brief,  this  program,  which  is  an  extension  of  an 
earlier  nonlinear  estimation  program  [7],  consists  of  three  concep¬ 
tual  parts  which  are  linked  together  to  perform  the  necessary  com¬ 
putations:  (a)  a  kinetics  language  for  input  of  the  kinetics  reaction 
model  and  experimental  data;  (b)  a  differential  equation  solver  for 
numerical  integration  of  the  rate  equations;  and  (c)  a  nonlinear  esti¬ 
mation  algorithm  for  obtaining  least-squares  estimates  of  the  para¬ 
meters  via  the  Gauss  method.  First  derivatives  are  obtained  in  the 
program  by  finite  difference  approximation. 
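The estimation step (c) can be illustrated by a minimal Python sketch of the Gauss method with first derivatives obtained by finite differences. This is not the Lapidus-Peterson program itself: the function names, the step size, the convergence test, and the synthetic Langmuir-type example are assumptions chosen for illustration, and the sketch omits the kinetics language and the differential equation solver.

```python
from typing import Callable, List, Sequence

Model = Callable[[float, Sequence[float]], float]


def jacobian(model: Model, xs: Sequence[float], theta: Sequence[float],
             rel_step: float = 1e-6) -> List[List[float]]:
    """Forward-difference derivatives d(model)/d(theta_p) at every data point."""
    base = [model(x, theta) for x in xs]
    rows = [[0.0] * len(theta) for _ in xs]
    for p, value in enumerate(theta):
        h = rel_step * max(abs(value), 1.0)
        shifted = list(theta)
        shifted[p] = value + h
        for k, x in enumerate(xs):
            rows[k][p] = (model(x, shifted) - base[k]) / h
    return rows


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            aug[k] = [vk - factor * vc for vk, vc in zip(aug[k], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gauss_newton(model: Model, xs: Sequence[float], ys: Sequence[float],
                 theta0: Sequence[float], max_iter: int = 50,
                 tol: float = 1e-10) -> List[float]:
    """Least-squares estimates by repeated solution of (J^T J) d = J^T r."""
    theta = list(theta0)
    m = len(theta)
    for _ in range(max_iter):
        resid = [y - model(x, theta) for x, y in zip(xs, ys)]
        jac = jacobian(model, xs, theta)
        jtj = [[sum(row[p] * row[q] for row in jac) for q in range(m)]
               for p in range(m)]
        jtr = [sum(row[p] * r for row, r in zip(jac, resid)) for p in range(m)]
        step = solve(jtj, jtr)
        theta = [t + d for t, d in zip(theta, step)]
        if max(abs(d) for d in step) < tol:
            break
    return theta


# Usage: synthetic, noise-free data from a hypothetical Langmuir-type rate law.
def langmuir(x, theta):
    return theta[0] * x / (1.0 + theta[1] * x)


pressures = [0.5, 1.0, 2.0, 4.0]
rates = [langmuir(x, [2.0, 1.0]) for x in pressures]
estimates = gauss_newton(langmuir, pressures, rates, theta0=[1.0, 0.5])
print(estimates)
```

The plain Gauss step has no safeguard against divergence from a poor starting point, which is one motivation for the modified methods used by the other programs listed in this appendix.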

This  program  has  been  found  very  advantageous  since  the  input 
is  entered  in  a  manner  which  is  close  to  the  kineticist’s  way  of  think¬ 
ing  and  the  emphasis  is  placed  on  the  physical  chemistry  aspects 
rather  than  the  mathematical  formulation.  Thus  the  investigator  en¬ 
ters  card  formats  which  typically  read  as  follows: 

Program entry                  Meaning

A = R + S                      A → R + S

Surface reaction control       surface-reaction controlling model

(1 +( )0.28A*0.5)*2            denominator term in rate expression,
                               $(1 + 0.28\,p_A^{0.5})^2$

Here $K_A$ = 0.28 is a starting value in the nonlinear estimation of $K_A$.
The exponents $\alpha$ = 0.5 and $\beta$ = 2.0 are considered fixed in this
particular format statement, but may also be estimated if desired.
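To make the meaning of the third entry concrete, the denominator term can be written as a small Python function. The general form $(1 + K_A\,p_A^{\alpha})^{\beta}$ is inferred from the values quoted above; the function name and the pressure used in the usage line are hypothetical.

```python
def denominator_term(p_a: float, k_a: float = 0.28,
                     alpha: float = 0.5, beta: float = 2.0) -> float:
    """Denominator of the rate expression, (1 + K_A * p_A**alpha)**beta.

    k_a is the starting value for the adsorption constant K_A; alpha and
    beta are held fixed here but could be treated as adjustable parameters.
    """
    return (1.0 + k_a * p_a ** alpha) ** beta


# Usage at an illustrative partial pressure of A.
print(denominator_term(4.0))
```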
From this input and additional data such as values of initial
concentrations, settings of the independent variables and the
observations, the computer program automatically (a) develops the kinetic
rate expression; (b) integrates this expression numerically to yield,
in effect, a specific form of the model equation; and (c) provides
parameter estimates which fit the observations in the least-squares sense.

As output from the program, various statistics are produced.
Among those found useful are the parameter estimates $\theta$, the standard
deviations of the parameter estimates $\sigma_\theta$, the standard deviation
of the errors s, and the residuals between the observed and calculated
values of the dependent variable e.

2.  The  Eisenpress-Greenstadt  nonlinear  maximum  likelihood  pro¬ 
gram  suitable  for  the  IBM  7090/94  computers  [23].  This  program  was 
developed  originally  for  large-scale  economic  problems  and  has  no 
provisions  for  integrating  differential  equations.  Therefore  it  is  ap¬ 
plicable  to  kinetics  problems  only  where  differential  rate  measure¬ 
ments  are  available  or  when  analytic  solutions  to  the  rate  equations 
are  available.  This  program  maximizes  the  likelihood  function  given 
by  Eq.  (24),  corresponding  to  the  following  assumptions:  (a)  errors 
are  normally  distributed;  (b)  errors  in  different  experiments  are 
independent;  and  (c)  the  same  unknown  covariance  matrix  applies  to 
the  different  observed  variables  in  each  experiment.  This  matrix  is 
estimated  along  with  the  parameters. 

The program uses Greenstadt’s modification of the Newton-Raphson
method, as described previously. This requires the evaluation
of  the  first  and  second  derivatives.  The  computer  itself,  using  the 
FORMAC  system  of  algebraic  formula  manipulations  [6],  performs 
all  the  required  differentiations  analytically. 

3. The Bard nonlinear parameter estimation program [3], suitable
for any computer accepting FORTRAN IV programs. It solves
least-squares, weighted least-squares, maximum likelihood, and
Bayesian  estimation  problems  of  the  types  discussed  above.  It  in¬ 
corporates  an  integration  routine  and  special  routines  for  generating 
the  rate  equations  and  their  derivatives  in  kinetics  problems.  It  uses 
the  generalized  Gauss-Newton  method,  or,  optionally,  the  Davidon- 
Fletcher-Powell  method  [20,25]. 

4. The Marquardt program [55], written in FORTRAN IV. It uses
Marquardt’s  method  [54]  and  may  be  coupled  easily  to  integration 
routines. 